Artificial Developer Intelligence
Is GPT-5.5 finally a 4.7-tier model? Did DeepSeek V4 just close the gap with Anthropic? And what does it mean that a senior ML engineer says he can't out-code Claude anymore? Co-hosts Shimin Zhang, Dan Lasky, and Rahul Yadav are joined by special guest Nathan Lubchenco, ML engineer and Substack author of *The future was yesterday* (https://nathanlubchenco.substack.com/), on ADI Pod #23 (April 28, 2026).

This episode covers OpenAI's GPT-5.5 release, DeepSeek V4 (1.6T base / 49B active params with 1M context), Meta's new Model Capability Initiative tracking US employee keystrokes and mouse movements, a Levenshtein-distance study on coding-model over-editing, the 2026 Stanford AI Index report, and a deep-dive interview on how to hire software engineers when the agents are already better at coding than the candidates.

Key takeaways
• Models are now consistently better at coding than even senior ML engineers, by the engineers' own admission. Late 2026 may be when they cross the median software engineer.
• Coding-model over-editing is measurable (Levenshtein distance on boolean-flip tasks) and responds to instruction: explicit "minimum-edit" prompts close most of the gap.
• The US is, unusually, a slow adopter of a major technological wave. Workplace AI usage is highest in emerging economies, not the developed world.
• "The task is not the job": humans remain indispensable on the bundling dimensions, such as catching what customers don't say and avoiding interactions that end up on social media.
• Software engineering interviews should include the candidate's personal harness, with company-provided API keys for equity. LeetCode optimizes for the wrong signal in 2026.
• DeepSeek V4 closing the gap with Mythos in 3–6 months is what makes the bubble too geopolitically important to fail.
Chapters
* (00:00) - Cold Open & Welcome
* (01:31) - News Threadmill: GPT-5.5, DeepSeek V4, Meta Watches Every Keystroke
* (12:28) - Post-Processing: Coding Models Are Doing Too Much
* (18:59) - Post-Processing: The Task Is Not the Job (Luis Garicano)
* (32:20) - Post-Processing: The 2026 Stanford AI Index Report
* (38:11) - Deep Dive: Interviewing Engineers in the AI Age (with Nathan Lubchenco)
* (45:05) - Deep Dive: Reforming Software Hiring (Take-Homes, Personal Harness, Equity)
* (50:15) - Deep Dive: When Models Cross the Median Engineer (Late-2026 Prediction)
* (59:29) - Deep Dive: Why Code Review Is the Current Bottleneck
* (01:00:21) - Deep Dive: Should PRs Show the Prompt History?
* (01:02:27) - Dan's Rant: Anthropic Tested Removing Claude Code from the Pro Plan
* (01:05:44) - Rahul's Rampage: The Infinity Machine (Demis Hassabis & Corporate Gravity)
* (01:14:32) - Two Minutes to Midnight: Bubble Clock Moves Back to 4:00
* (01:26:30) - Outro

Resources mentioned

**Models & news**
• OpenAI: Introducing GPT-5.5: https://openai.com/index/introducing-gpt-5-5/
• Engadget: DeepSeek promises its new AI model has world-class reasoning: https://www.engadget.com/ai/deepseek-promises-its-new-ai-model-has-world-class-reasoning-115733512.html
• Reuters: Meta to start capturing employee mouse movements, keystrokes for AI training data: https://www.reuters.com/sustainability/boards-policy-regulation/meta-start-capturing-employee-mouse-movements-keystrokes-ai-training-data-2026-04-21/

**Post-processing articles**
• "Coding Models Are Doing Too Much", a Levenshtein-distance over-editing study (nrehiew): https://nrehiew.github.io/blog/minimal_editing/
• Luis Garicano (Silicon Continent): Why Desk Jobs Survive ("The task is not the job"): https://www.siliconcontinent.com/p/why-desk-jobs-survive-and-amodei
• 2026 AI Index Report, Stanford Institute for Human-Centered AI: https://hai.stanford.edu/ai-index/2026-ai-index-report

**Deep dive**
• Nathan Lubchenco: Interviewing Software Engineers in the Age of AI: https://nathanlubchenco.substack.com/p/interviewing-software-engineers-in
• Nathan Lubchenco: *The future was yesterday* Substack home: https://nathanlubchenco.substack.com/

**Dan's rant**
• Ars Technica: Anthropic tested removing Claude Code from the Pro plan: https://arstechnica.com/ai/2026/04/anthropic-tested-removing-claude-code-from-the-pro-plan/

**Rahul's rampage**
• Sebastian Mallaby: *The Infinity Machine* (book on Demis Hassabis and DeepMind)
• Philipp Dubach: Do Not Disturb My Circles (Archimedes essay): https://philippdubach.com/posts/do-not-disturb-my-circles/

**Bubble watch**
• TechCrunch: Two college kids raise $5.1M pre-seed to build an AI social network in iMessage: https://techcrunch.com/2026/04/24/two-college-kids-raise-a-5-1-million-pre-seed-to-build-an-ai-social-network-in-imessage/
• Toby Ord: Hourly Costs for AI Agents: https://www.tobyord.com/writing/hourly-costs-for-ai-agents
• CNBC: OpenAI reportedly missed revenue targets, shares of Oracle and chip stocks falling: https://www.cnbc.com/2026/04/28/openai-reportedly-missed-revenue-targets-shares-of-oracle-and-these-chip-stocks-are-falling.html

About ADI Pod
ADI Pod (Artificial Developer Intelligence) is a weekly podcast about AI and software development for working developers. Co-hosts Shimin Zhang, Dan Lasky, and Rahul Yadav go through hundreds of links and dozens of newsletters every week so you don't have to.

This week's special guest: **Nathan Lubchenco**, ML engineer and author of *The future was yesterday* on Substack, where he writes about AI and software engineering.

• Website: https://www.adipod.ai
• Email: humans@adipod.ai
26 episodes