Artificial Developer Intelligence
Is Claude Opus 4.7 really burning tokens? Is open source dead after mythos? Co-hosts Shimin Zhang and Dan Lasky — with recurring guest Rahul Yadav — ran the experiments this week on ADI Pod #22 (April 21, 2026). This episode covers Anthropic's Claude Opus 4.7 release (the "mythos slice"), Alibaba's open-source Qwen 3.6 35B A3B, cal.com going closed source for security reasons, and a HIPAA-violating vibe-coded patient portal that is, in Dan's words, the bullshit future already here.

In this episode

▸ **Claude Opus 4.7 review** — the new mythos-derived tokenizer (3× bloat on plain English), stricter instruction-following, and why Shimin's SVG experiments suggest the token-burn panic is overblown: 35¢ on Opus 4.7 vs $2 on Opus 4.6 for the same task, with ~40× fewer reasoning tokens.
▸ **Qwen 3.6 35B A3B** — Alibaba's open-source mixture-of-experts model (3B active params at any time) running locally on Shimin's laptop at 90–95 tokens/sec via llama.cpp + Unsloth. The first model to break Simon Willison's pelican-on-a-bicycle benchmark against a larger frontier model.
▸ **cal.com goes closed source** — why the AI Security Institute's $12,000-per-attempt mythos pentesting data ($125,000 for 10 runs) is changing the open-source calculus, and Drew Breunig's three-phase dev/review/hardening cycle prediction.
▸ **Jesse Vincent's "Rules and Gates"** — a coding-agent prompting technique that reformulates optional preferences into directed preconditions, and whether agents can "weasel out" by rewriting the gate itself.
▸ **AI vibe coding horror story** — a German doctor who inlined a full patient portal into a single HTML page with database credentials client-side. HIPAA, meet DSGVO.
▸ **Kyle Kingsbury's "The Future of Everything is Lies"** — the Jepsen author's 8-step action list on AI's second- and third-order societal effects.
▸ **The AI-on-AI Arena** — Shimin's weekend project grading 11 frontier models against each other. The "delusion index" reads almost exactly like Dunning-Kruger in humans: GPT-5.4 scored -1.6 (humble), Gemini 3.1 Pro Preview rated itself well while peers ranked it last.
▸ **Two Minutes to Midnight** — Paul Graham's log-scale chart comparing AI capex (~1% of US GDP) to the US railroad peak (~10%). We dialed the AI bubble clock back 45 seconds to 3 min 30 sec.

Key takeaways

— Opus 4.7's token-burn reputation may be overblown. Stricter instruction-following can reduce total reasoning tokens by up to 40× vs Opus 4.6 on the same task.
— Security-driven closed-sourcing may spread as mythos-class agents make open repos easier to exploit. Hardening could make software capital-intensive again.
— Cognitive debt is real: Dan's wake-up call was a production bug a pre-LLM colleague solved in 5 minutes. His first instinct was to double down on the tool.
— Shimin's defense against skill atrophy: read 100% of LLM-generated PR lines (except tests).
— Weaker models rate themselves higher than stronger ones. Calibration appears to improve with capability.
Chapters

* (00:00) - Introduction to AI and Software Development
* (02:25) - Alibaba's Qwen 3.6 Model Overview
* (08:06) - Anthropic's Claude Opus 4.7 Release
* (18:08) - Cal.com Goes Closed Source: Implications for Security
* (20:40) - The Future of Vibe Coding
* (23:41) - Techniques for Effective AI Utilization
* (27:13) - Post-Processing and AI in Real-World Applications
* (33:07) - The Cultural Impact of AI and Technology
* (41:30) - Navigating Code Review Challenges
* (42:57) - Exploring AI's Societal Impact
* (45:16) - Evaluating AI Models: Performance and Insights
* (49:09) - The Future of Data Centers and AI
* (50:54) - Investment Trends and Economic Perspectives
* (57:58) - Reflections on Historical Investment Cycles
* (59:35) - Optimism Amidst Uncertainty

Resources mentioned

Claude Opus 4.7 & Qwen 3.6
• Introducing Claude Opus 4.7 (Anthropic): https://www.anthropic.com/news/claude-opus-4-7
• Claude Opus 4.7 System Card: https://cdn.sanity.io/files/4zrzovbb/website/037f06850df7fbe871e206dad004c3db5fd50340.pdf
• Qwen3.6-35B-A3B: Agentic Coding Power, Now Open to All: https://qwen.ai/blog?id=qwen3.6-35b-a3b
• Simon Willison — Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7: https://simonwillison.net/2026/Apr/16/qwen-beats-opus/
• Shimin — Opus 4.7 isn't dumb, it's just lazy: https://shimin.io/journal/opus-4-7-just-lazy/

Security & open source
• Cal.com is going closed source. Here's why: https://cal.com/blog/cal-com-goes-closed-source-why
• Drew Breunig — Cybersecurity Looks Like Proof of Work Now: https://www.dbreunig.com/2026/04/14/cybersecurity-is-proof-of-work-now.html

Technique & commentary
• Jesse Vincent — Rules and Gates: https://blog.fsck.com/2026/04/07/rules-and-gates/
• An AI Vibe Coding Horror Story: https://www.tobru.ch/an-ai-vibe-coding-horror-story/
• Kyle Kingsbury (Aphyr) — The Future of Everything is Lies, I Guess: https://aphyr.com/posts/411-the-future-of-everything-is-lies-i-guess

Shimin's project
• AI-on-AI Arena: https://shimin.io/ai-on-ai-arena

Bubble watch
• Ars Technica — Satellite and drone images reveal big delays in US data center construction: https://arstechnica.com/ai/2026/04/construction-delays-hit-40-of-us-data-centers-planned-for-2026/
• Epoch AI — OpenAI Stargate: where the US sites stand: https://epochai.substack.com/p/openai-stargate-where-the-us-sites
• Paul Graham on US investment cycles (log scale): https://x.com/paulg/status/2045120274551423142/photo/1

About ADI Pod

ADI Pod (Artificial Developer Intelligence) is a weekly podcast about AI and software development for working developers. Co-hosts Shimin Zhang and Dan Lasky go through hundreds of links and dozens of newsletters every week so you don't have to. Recurring guest Rahul Yadav joins when he can.

• Website: https://www.adipod.ai
• Email: humans@adipod.ai

New episodes every Friday. Follow the show to get them automatically.