The Bikeshed Pod

Podcast by Matt Hamlin, Dillon Curry & Scott Kaye

English

Technology & science

About The Bikeshed Pod

The Bikeshed Pod is a weekly show where developers dive deep into the small but important details of software development that we all love to debate.

All episodes

27 episodes

Plan Mode Sucks

EPISODE SUMMARY

Matt revisits a hot take from a year ago that he believes more strongly now: you shouldn't be using plan mode in your AI coding agents. The conversation lands on a more nuanced position: planning still matters, but plan mode just isn't the right tool for it anymore.

THE CASE AGAINST PLAN MODE

Matt argues that plan mode in Claude Code and Codex has degraded over the past month or so. Where it used to ask a ton of clarifying questions, it now spins for 30 minutes and hands back a full markdown plan without ever pinging you for context. The on-rails experience has stopped doing the part that made it valuable.

Scott pushes back gently: plan mode still has a place, especially for big architectural changes where one-shotting will leave you with a context provider sprinkled across multiple files (his real example from the previous Friday). But he agrees the out-of-the-box version isn't the only way to plan, and often isn't the best one.

PLANNING ≠ PLAN MODE

The real takeaway: planning, the activity, is still incredibly valuable. Plan mode, the feature, is just one increasingly mediocre implementation of it. The crew walks through the alternatives they're actually reaching for.

THE GRILL ME SKILL

Matt surfaces the grill-me skill Matt Pocock shared (and Dillon dropped in the Discord): a one-line skill that tells the agent to keep asking questions until it actually understands what you're trying to build. It's a strong fit for feature work where you don't yet know the shape of the problem space.

POC-FIRST DEVELOPMENT

Dillon describes his current workflow on a big work project: POC the entire user flow first, then POC each piece of the flow before building anything for real. He's been using Superpowers (the most popular Claude Code skill) and its brainstorming sub-skill, which builds mock interfaces so you can compare options. He'd rather over-plan than have to tell a coworker "Claude thought it was a good idea" when they ask why something works the way it does.

THE PLAN / BUILD SPECTRUM

Matt frames plan mode and build mode as two ends of a spectrum where you actually want to land somewhere in the middle: exploring three or four ideas, spinning off agents to POC each, bringing findings back, iterating. He hasn't found a skill that nails this loop yet, and invites listeners with a working setup to share it in the Discord.

PLAN MODE IS STILL GREAT FOR NEWCOMERS

Dillon's softer take: plan mode is genuinely useful when you're new to agentic tooling. It gives you a clear default workflow before you know what you actually need. You grow out of it as you discover the specific checks (codebase exploration, TDD, edge-case enumeration) you want before any code gets written.

ASK FOR SOURCES

Dillon's quick aside: when you're using the agent to learn something, ask it for sources. It'll mash concepts together, and being able to cross-check against the actual docs catches the seams.

STANDUP / LIFE UPDATES

* Dillon spent the week in northwest Arkansas (Fayetteville and Bentonville) for his brother's birthday and a baby shower for his niece, due in June. Bentonville was a surprise: Walmart HQ has turned the area into a brand-new, Apple-campus-tier hub since the company required vendors to relocate post-COVID. Cost of living is roughly half of Boston's, with Onyx Coffee on the ground. He looked at the Walmart careers page.
* Scott completed his eighth powerlifting meet in eight years (with a two-year, two-month injury gap): 597.5 kg total at 82.5 kg bodyweight, made up of a 200 squat, 140 bench, and 257.5 deadlift. That's a 1,317 lb total. He's 85 kg from the national qualifying total, or he turns 40 first and qualifies on the masters total he's already cleared. He's also teasing some open source work he's "boiling down to a usable small chunk" (ETA four or five years, per Scott; "what model are you using?", per Dillon).
* Matt has been ping-ponging across his personal tooling: he gave up vibe-coding his own note-taking app, gave Notion another try, set up scheduled Claude Code tasks to summarize recent notes, and switched from Arc to Chrome specifically for the Claude Chrome extension and Cowork. He's recording from a HubSpot meeting room during an onsite, with bachelor party planning in May wrapping up. And in the loudest beat of the segment: he had the agent build a library blending CRDTs, offline-first, RSC, and server actions, pulling from Automerge, Yjs, and TanStack Query, and it dutifully reinvented TanStack Query.

9 May 2026 - 34 min

Vibe-Coding Your Own Productivity Stack

EPISODE SUMMARY

In this episode, Dillon walks Scott and Matt through a personal productivity dashboard he's been building with Claude, and uses it as a jumping-off point for a wider conversation about what AI unlocks for "personal software."

THE DAILY BRIEFING DASHBOARD

Dillon's dashboard started as a joke: use Claude's new cron feature to post a daily inspirational quote at 9 a.m. He quickly realized he could put something genuinely useful there instead. The result is a single page he opens every morning that surfaces:

* To-dos
* Open PRs
* Jira tickets
* Datadog alerts
* Summaries of recent notes
* A custom Kanban board for tracking dev "harnesses" (scoping → planning → execution → review)

It's intentionally simple under the hood: zero dependencies, a Python server, HTML, and CSS. Run "make start" and you're running. He's burning roughly $2,500/mo in Claude tokens building it, has shared it openly with leadership and the broader company, and treats it as a sandbox for trying anything new in AI.

WHY BUILD IT YOURSELF?

Matt frames the bigger thesis: AI is a fast track to personal software, the small niche of building a tool tuned exactly to your own workflow rather than adopting something off the shelf or solving for millions of users. The closest off-the-shelf comparison would be something like Notion or Dream.ai, but neither would match Dillon's specific data sources or the way he wants to see them.

WHERE AI SURPRISED HIM (GOOD AND BAD)

* Struggles with UI consistency. AI gets the functionality right, but drifts from the design system, makes spacing and layout mistakes, and occasionally tries to "helpfully" refactor onto a totally different stack (e.g. "let's add SQLite") mid-project. Dillon's mitigation: keep it simple, have Claude audit the UI and write its own lightweight design system, and push reminders into CLAUDE.md.
* Matt's tip: expose a route on your app that renders all components on one page (poor man's Storybook) so the agent can discover existing patterns.
* Unexpected win: visual thinking. Dillon's been asking Claude to generate HTML pages with architecture diagrams, user flows, and dependency maps to build a mental model of unfamiliar projects before diving in. Matt does the same to navigate his monorepo's package dependency graph.

SKILLS DILLON HAS BUILT

* Start of Day / End of Day: a paired skill that asks reflection questions in the evening and gives him a standup-style recap in the morning, including "what's on my radar that I'm not thinking about."
* PR Status / PR Watch: pulls GitHub check status, surfaces comments, and runs every five minutes to send a Mac notification when a PR is ready to merge.
* Mind Dump: partner skill to End of Day that takes a stream of consciousness and organizes it into a structured markdown doc.
* Contentful skills: connect to the CMS API to pull content types, explain how they work, and (experimentally) architect new ones.
* The "Grill Me" skill (borrowed from Matt Pocock): has Claude slowly ask questions about a plan to surface edge cases and force thinking through the problem.

His meta-tip: every time you use a skill, reflect on how it did and ask Claude to improve it.

THE PRODUCTIVITY PARADOX

Has it actually made him more productive? Yes, but the new problem is spreading too thin. Dillon shipped 14 PRs in a week and now has 20 open ones he can't get back to. As Matt jokes: "the trick is to go faster." The real discipline is cleaning up after yourself, slowing down, and focusing on one thing at a time, even when you have 12 worktrees open.

CAVEATS AND TAKEAWAYS

* This is personal software: running locally, no deploy target, code quality intentionally rough. It's not how Dillon does actual work.
* A lot of devs at his company are afraid to build things outside their tickets. Dillon's been transparent with leadership and turned it into a shared resource instead.
* If you want to start: literally talk to Claude (or sketch a screenshot) about what you'd want to see every morning, and go from there.

TEASERS

Future episode ideas raised: how to get good UI out of agents, and using AI to onboard yourself onto an unfamiliar codebase. Scott also hints he has a "pretty good solution" for the UI consistency problem, which he's saving for another episode.
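
For a sense of how small the "zero dependencies, a Python server, HTML, and CSS" setup can be, here is a minimal sketch using only the Python standard library. It is not Dillon's code: the section names, the placeholder data, and the names render_briefing, BriefingHandler, and main are all hypothetical, and a real version would fetch PRs, tickets, and alerts from their own APIs, which the episode does not describe.

```python
"""Sketch of a zero-dependency daily briefing page (standard library only)."""
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical placeholder data. A real dashboard would fill these lists
# from PR, ticket, and alerting APIs instead of hard-coding them.
SECTIONS = {
    "To-dos": ["Review design doc"],
    "Open PRs": ["Example PR title"],
    "Alerts": [],
}


def render_briefing(sections):
    """Return one HTML page with a heading and a list per section."""
    parts = [
        "<!doctype html>",
        "<title>Daily briefing</title>",
        "<h1>Daily briefing</h1>",
    ]
    for title, items in sections.items():
        parts.append(f"<h2>{escape(title)}</h2>")
        if items:
            rows = "".join(f"<li>{escape(item)}</li>" for item in items)
            parts.append(f"<ul>{rows}</ul>")
        else:
            parts.append("<p>Nothing here today.</p>")
    return "\n".join(parts)


class BriefingHandler(BaseHTTPRequestHandler):
    """Serve the rendered briefing for every GET request."""

    def do_GET(self):
        body = render_briefing(SECTIONS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main(port=8000):
    """Serve the page on localhost until interrupted (port is an assumption)."""
    HTTPServer(("127.0.0.1", port), BriefingHandler).serve_forever()
```

Calling main() and opening http://127.0.0.1:8000 in a browser would show the page. Keeping the rendering in a plain function, separate from the request handler, makes it easy for a person or an agent to change one section without touching the server.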

9 May 2026 - 34 min

No(de.js) AI

THE 19,000-LINE SLOPFORK: NODE.JS, CLAUDE CODE, AND THE AI CONTRIBUTION CRISIS

Matt, Scott, and Dillon unpack one of the messiest open source dramas of the moment: Matteo Collina, Node TSC member and Fastify creator, dropped a ~19,000-line PR on Node.js core over Christmas break, openly built with the help of Claude Code. The PR adds long-requested virtual file system support, intercepting 164+ points across fs, fs/promises, and the module loading system. Over half the diff is tests, which is part of why it raised eyebrows in the first place: that volume of integration tests is something a human contributor likely wouldn't have written by hand.

THE DCO QUESTION

The crew dig into the Developer's Certificate of Origin (DCO) and whether agent-generated code cleanly satisfies it. Does Claude-written code count as "authored by you"? It's still a foggy question, and one contributor was rattled enough to start a petition to ban AI-generated code from Node.js core. Matteo's response: I made all the decisions, I fixed the AI's mistakes, it's still my code.

PROCESS, NOT JUST AI

Scott's take is that the size and abruptness are doing as much damage as the AI angle. There were existing issues discussing a VFS, but no RFC, no upfront tech plan, and the commit history is borderline unreviewable. It's classic "easier to ask forgiveness than permission" energy, but on a change that touches a major surface of the runtime. The crew sympathize with the engineer's instinct to just ship the thing, but agree that a feature this big needed buy-in first. (Scott would have left a "nit:" comment asking for a rebase to a single commit.)

HOW DO YOU EVEN POLICE THIS?

Dillon raises the obvious enforcement problem: AI detection tools have the same false-positive issues that plague universities. A one-line bug fix written by an agent is indistinguishable from a human's. That points toward either accepting AI-assisted contributions outright or building entirely new governance, which is roughly where the broader OSS community seems to be landing (the related issue was reportedly closed with consensus that AI-assisted dev is allowed).

WHAT IF NODE SAYS NO?

Matt poses the strategic question: if Node.js bans AI contributions, does that hand momentum to Bun and Deno? Bun is already leaning hard into Claude-assisted development, ships features fast (native SQLite being the canonical example), and operates as a company rather than a committee, so it has structural advantages on velocity and backwards-compat tradeoffs. Scott pushes back that big corporations are slow to migrate runtimes regardless, but Matt counters that agents dramatically lower switching costs: point Claude at your codebase and say "migrate this from Node to Bun" and it's plausibly a weekend.

SLOPFORKS AND THE SQLITE PLAYBOOK

The conversation widens to Cloudflare's "vinext", a Vite-based Next.js reimplementation built by pointing an agent at Next's test suite, which popularized the term slopfork. That sparked talk of tldraw considering closing their test suite to prevent agent-driven reimplementation, and the long-standing SQLite model where the code is open source but the comprehensive test suite is paid/closed. Expect more projects to consider that pattern as agents make test-suite-driven reimplementation trivially cheap.

CLOSING TAKES

* Open source projects may need to lean into AI just to stay competitive with company-backed runtimes.
* The irony: another Node contributor used Claude to write a deep-dive review of the PR itself.
* Matteo also published a userland polyfill on npm, hedging against Node's slow merge process.
* Scott's verdict: merge it already.

Plus a brief detour on the io.js fork of yore, and Matt's proposed name for the inevitable Node slopfork: input-output.js.
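
As background to the DCO discussion: projects that enforce a DCO commonly do it by requiring a "Signed-off-by: Name <email>" trailer, which is what git commit -s appends to a commit message. The episode does not say how Node.js checks this, so the sketch below is an assumption about the general mechanism, and the function names sign_offs and is_signed_off_by are hypothetical. It also shows why the question stays foggy: a trailer check confirms that a person attested to the contribution, not who or what typed the code.

```python
import re

# Trailer format appended by `git commit -s`: "Signed-off-by: Name <email>".
SIGN_OFF = re.compile(
    r"^Signed-off-by: (?P<name>[^<>]+) <(?P<email>[^<>@\s]+@[^<>\s]+)>$"
)


def sign_offs(commit_message):
    """Return (name, email) pairs for every Signed-off-by line in a message."""
    found = []
    for line in commit_message.splitlines():
        match = SIGN_OFF.match(line.strip())
        if match:
            found.append((match["name"].strip(), match["email"]))
    return found


def is_signed_off_by(commit_message, author_email):
    """True if the author's own email appears in a Signed-off-by trailer."""
    wanted = author_email.lower()
    return any(email.lower() == wanted for _, email in sign_offs(commit_message))
```

A check like this can gate a merge on the attestation being present, but it has no way to tell an agent-written diff from a hand-written one, which is the enforcement gap Dillon points to.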

14 Apr 2026 - 31 min

Selling The Increment: PR Scope, Nitpicks, and Token Anxiety

MAIN TOPIC: WHEN PR FEEDBACK DIVERGES FROM THE PLAN

Matt and Scott both recently shipped PRs and got reviewer feedback that didn't match what they thought the team had agreed on: either asking for far more scope than intended, or pushing back on an incremental approach in favor of the "ideal" final solution.

SCOPE CREEP IN REVIEW

The classic example: you make one focused change in an old codebase, and a reviewer points to an unrelated eslint-disable comment five lines above your diff and asks you to "fix it properly." Suddenly your tight PR is fighting a battle you didn't sign up for.

INCREMENTAL VS. IDEAL

Scott frames the core tension: most of the team agrees with incremental PRs in principle, but in practice a reviewer can blindside you by insisting the PR get all the way to the ideal end-state. The skill is selling the increment: showing the path from "value shipped today" to "ideal solution later" so reviewers buy into the staging, not just the destination.

Scott's concrete example: he was building end-to-end smoke tests where the ideal version was blocked by another team. He had to plead his case, point to follow-up tickets, and frame the smoke tests as immediate value on the way to the real thing.

TRUST, TEAMS, AND CONTEXT

Dillon notes he doesn't hit this much. His team has jelled, and they tell each other up front "you're not going to like this, let's talk in person." Scott contrasts that with newer teams or strong-opinioned teammates where trust hasn't been built and standups go in one ear, out the other.

PLANS AS THE THROUGH-LINE

The group converges on communicate, communicate, communicate. Matt argues for creating a clear through-line: an issue or doc that breaks the work into steps so a reviewer landing on PR #3 can click back to the plan. It doesn't have to be a senior-staff-signoff design doc; even a one-pager or a markdown plan from Claude counts. Matt also pitches "shift left" on reviews: get eyes on the plan before the PR, not at PR time.

Dillon introduces the PRD framing: a product requirements doc owned by PMs, separate from an engineering design doc. When you don't have a dedicated PM, you have to own that artifact yourself, and teams sometimes forget.

NITPICKS AND CODE REVIEW CULTURE

Matt's open question: how do you stop blocking nitpick comments from dominating? Scott's stance is firm: nitpicks should not block PRs. Unless there's a real architectural problem or a literal bug, call it out, approve anyway, and let the author decide whether to address it now or as a follow-up. Code review is a two-way conversation, not a one-way directive, regardless of seniority.

Dillon's tactics: mark nitpicks as out-of-scope with a follow-up ticket, or just take the conversation to Slack/in-person to break through faster than async GitHub threads.

A frustrating wrinkle Matt raises: their SOX-required approval process dismisses approvals on any new push, so even fixing a non-blocking comment forces a re-review cycle, actively discouraging the fast-follow behavior everyone says they want.

The underlying message: assume good intent, trust your fellow engineers, and recognize that increasingly you may be reviewing a Claude-generated PR anyway.

STAND-UP

* Dillon: Witnessed comedic whiplash at work. One presenter said "turn on bypass permissions and use Opus for everything," and an engineering leader said the exact opposite an hour later. Has been vibe-coding a personal dashboard with Opus and burned $300 in four hours, leading him to discover "token anxiety." Can't even relax on the couch wondering if the agent is deleting his computer.
* Scott: Powerlifting meet next Sunday, body hurts. Saw Project Hail Mary, loved it, then proceeded to spoil it on-mic.
* Matt: Going to see Project Hail Mary tonight (thanks Scott). Otherwise enjoying being back in the metaphorical (and literal) booth.

14 Apr 2026 - 40 min

Retro & React - 3

THE OPENCLAW SAGA: REBRANDS, MALWARE, AND WHO PAYS FOR YOUR AGENT

Matt, Scott, and Dillon try to make sense of one of the fastest-moving stories in agentic AI: the project that started as Clawdbot, briefly became Moltbot for about three hours, and now goes by OpenClaw, at least until the next rebrand.

WHAT EVEN IS OPENCLAW?

The crew sets the stage for Dillon (and any listener who's been blissfully out of the loop). OpenClaw is an always-on, multi-agent autonomy platform: think "proto-Jarvis" or "Siri but 100x." You hand it your calendar, email, Slack, messaging apps, and home automation, and it goes off and does the menial digital work for you. Matt notes its GitHub star growth has outpaced essentially every project ever, though the guys speculate how much of that is real humans vs. agents spinning up GitHub accounts and starring the repo themselves.

THE MARKETPLACE MALWARE SCANDAL

Scott opens the episode with the bombshell: the #1 most-downloaded skill in the OpenClaw marketplace was malware, stealing SSH keys, crypto wallets, and browser cookies, and opening a reverse shell to the attacker's server. 1,184 malicious skills were found in total, with one attacker responsible for 677 packages. The conversation turns to the obvious tension: the more credentials and access you give an always-on agent, the more catastrophic a supply-chain attack on its plugin ecosystem becomes.

THE REBRAND CAROUSEL AND THE DRAMA

The hosts walk through the dizzying pace of changes: naming, ownership questions, and the broader "is this a scam?" vibes swirling around the project. Matt points out the loop is so tight that anything they say will probably be outdated by the time the episode drops.

ANTHROPIC PULLS THE PLUG ON SUBSCRIPTION BILLING

A big thread: Matt wanted to run his own OpenClaw agent but is held back because he can't point it at his Claude subscription anymore. The crew theorizes why: consumer subscriptions are almost certainly loss-leaders for the model providers, while per-token API billing is where the margin lives. Shutting subscription access off for agents that burn tokens 24/7 is basically self-defense.

TOKENS, TOKENS, TOKENS

Dillon makes the point that the model companies probably love OpenClaw in principle: it's a perfect machine for getting customers to burn tokens faster. He also shares that he recently got a gentle slap on the wrist at work for always reaching for the most expensive model. Scott admits he never swaps models either. The guys riff on the dystopia of paying big AI companies to build agents that do work for other big companies, while no one's handing us the robot.

WRAPPING UP

They close with predictions that the Clawdbot/OpenClaw competition between model providers will keep producing new shiny things until someone gets acquired, and a half-joking suggestion that listeners should one-shot a startup idea on Saturday afternoon and become billionaires by Monday.

11 Apr 2026 - 41 min