Artificial Developer Intelligence
Is Claude Opus 4.7 really burning tokens? Is open source dead after mythos? Co-hosts Shimin Zhang and Dan Lasky — with recurring guest Rahul Yadav — ran the experiments this week on ADI Pod #22 (April 21, 2026). This episode covers Anthropic's Claude Opus 4.7 release (the "mythos slice"), Alibaba's open-source Qwen 3.6 35B A3B, cal.com going closed source for security reasons, and a HIPAA-violating vibe-coded patient portal that is, in Dan's words, the bullshit future already here.

In this episode

▸ **Claude Opus 4.7 review** — the new mythos-derived tokenizer (3× bloat on plain English), stricter instruction-following, and why Shimin's SVG experiments suggest the token-burn panic is overblown: 35¢ on Opus 4.7 vs $2 on Opus 4.6 for the same task, with ~40× fewer reasoning tokens.
▸ **Qwen 3.6 35B A3B** — Alibaba's open-source mixture-of-experts model (3B active params at any time) running locally on Shimin's laptop at 90–95 tokens/sec via llama.cpp + Unsloth. The first model to break Simon Willison's pelican-on-a-bicycle benchmark against a larger frontier model.
▸ **cal.com goes closed source** — why the AI Security Institute's $12,000-per-attempt mythos pentesting data ($125,000 for 10 runs) is changing the open-source calculus, and Drew Breunig's three-phase dev/review/hardening cycle prediction.
▸ **Jesse Vincent's "Rules and Gates"** — a coding-agent prompting technique that reformulates optional preferences into directed preconditions, and whether agents can "weasel out" by rewriting the gate itself.
▸ **AI vibe coding horror story** — a German doctor who inlined a full patient portal into a single HTML page with database credentials client-side. HIPAA, meet DSGVO.
▸ **Kyle Kingsbury's "The Future of Everything is Lies"** — the Jepsen author's 8-step action list on AI's second- and third-order societal effects.
▸ **The AI-on-AI Arena** — Shimin's weekend project grading 11 frontier models against each other. The "delusion index" reads almost exactly like Dunning-Kruger in humans: GPT-5.4 scored -1.6 (humble), Gemini 3.1 Pro Preview rated itself well while peers ranked it last.
▸ **Two Minutes to Midnight** — Paul Graham's log-scale chart comparing AI capex (~1% of US GDP) to the US railroad peak (~10%). We dialed the AI bubble clock back 45 seconds to 3 min 30 sec.

Key takeaways

— Opus 4.7's token-burn reputation may be overblown. Stricter instruction-following can reduce total reasoning tokens by up to 40× vs Opus 4.6 on the same task.
— Security-driven closed-sourcing may spread as mythos-class agents make open repos easier to exploit. Hardening could make software capital-intensive again.
— Cognitive debt is real: Dan's wake-up call was a production bug a pre-LLM colleague solved in 5 minutes. His first instinct was to double down on the tool.
— Shimin's defense against skill atrophy: read 100% of LLM-generated PR lines (except tests).
— Weaker models rate themselves higher than stronger ones. Calibration appears to improve with capability.
Chapters

* (00:00) - Introduction to AI and Software Development
* (02:25) - Alibaba's Qwen 3.6 Model Overview
* (08:06) - Anthropic's Claude Opus 4.7 Release
* (18:08) - Cal.com Goes Closed Source: Implications for Security
* (20:40) - The Future of Vibe Coding
* (23:41) - Techniques for Effective AI Utilization
* (27:13) - Post-Processing and AI in Real-World Applications
* (33:07) - The Cultural Impact of AI and Technology
* (41:30) - Navigating Code Review Challenges
* (42:57) - Exploring AI's Societal Impact
* (45:16) - Evaluating AI Models: Performance and Insights
* (49:09) - The Future of Data Centers and AI
* (50:54) - Investment Trends and Economic Perspectives
* (57:58) - Reflections on Historical Investment Cycles
* (59:35) - Optimism Amidst Uncertainty

Resources mentioned

Claude Opus 4.7 & Qwen 3.6
• Introducing Claude Opus 4.7 (Anthropic): https://www.anthropic.com/news/claude-opus-4-7
• Claude Opus 4.7 System Card: https://cdn.sanity.io/files/4zrzovbb/website/037f06850df7fbe871e206dad004c3db5fd50340.pdf
• Qwen3.6-35B-A3B: Agentic Coding Power, Now Open to All: https://qwen.ai/blog?id=qwen3.6-35b-a3b
• Simon Willison — Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7: https://simonwillison.net/2026/Apr/16/qwen-beats-opus/
• Shimin — Opus 4.7 isn't dumb, it's just lazy: https://shimin.io/journal/opus-4-7-just-lazy/

Security & open source
• Cal.com is going closed source. Here's why: https://cal.com/blog/cal-com-goes-closed-source-why
• Drew Breunig — Cybersecurity Looks Like Proof of Work Now: https://www.dbreunig.com/2026/04/14/cybersecurity-is-proof-of-work-now.html

Technique & commentary
• Jesse Vincent — Rules and Gates: https://blog.fsck.com/2026/04/07/rules-and-gates/
• An AI Vibe Coding Horror Story: https://www.tobru.ch/an-ai-vibe-coding-horror-story/
• Kyle Kingsbury (Aphyr) — The Future of Everything is Lies, I Guess: https://aphyr.com/posts/411-the-future-of-everything-is-lies-i-guess

Shimin's project
• AI-on-AI Arena: https://shimin.io/ai-on-ai-arena

Bubble watch
• Ars Technica — Satellite and drone images reveal big delays in US data center construction: https://arstechnica.com/ai/2026/04/construction-delays-hit-40-of-us-data-centers-planned-for-2026/
• Epoch AI — OpenAI Stargate: where the US sites stand: https://epochai.substack.com/p/openai-stargate-where-the-us-sites
• Paul Graham on US investment cycles (log scale): https://x.com/paulg/status/2045120274551423142/photo/1

About ADI Pod

ADI Pod (Artificial Developer Intelligence) is a weekly podcast about AI and software development for working developers. Co-hosts Shimin Zhang and Dan Lasky go through hundreds of links and dozens of newsletters every week so you don't have to. Recurring guest Rahul Yadav joins when he can.
• Website: https://www.adipod.ai
• Email: humans@adipod.ai

New episodes every Friday. Follow the show to get them automatically.