Human in the Loop
The co-host gets a French accent this week, courtesy of Mistral Large 3, a granular mixture-of-experts model from the European lab that keeps punching above its weight. But the real subject is the harness: the scaffolding that turns a language model into something that can act. Mark and the co-host dig into the “sandwich architecture” of voice agents (speech-to-text → LLM → text-to-speech), why it makes conversations feel like tennis matches, and the “criminally overlooked” practice of evals.

A UC Berkeley paper provides the reality check: 68% of deployed agents need human intervention within ten steps, 70% use off-the-shelf models, and 74% depend on human evaluation. The hype says autonomous agents are coming. The data says we’re still building harnesses.

News & Culture
* Sahil Lavingia on X: “Harness is the new app” [https://x.com/shl/status/1997492199328485400]
* “The New AI Poisoning Attack Vector Scammers are Using NOW” [https://medium.com/@pe.stafford/the-new-ai-poisoning-attack-vector-scammers-are-using-now-dbc0f98b199f]
* Christopher Alexander’s A Pattern Language [https://www.patternlanguage.com/bookstore/pattern-language.html]
* The Resonant Computing Manifesto [https://resonantcomputing.org/]

Models, Tools, & Platforms
* Mistral Large 3 [https://mistral.ai/news/mistral-3]
* ElevenLabs Agents Platform [https://elevenlabs.io/agents?pscd=try.elevenlabs.io&ps_partner_key=NTI5NDk4NTZlYjc5&ps_xid=v4mAU4lcch9iGb&gsxid=v4mAU4lcch9iGb&gspk=NTI5NDk4NTZlYjc5]
* ChatGPT 5.2 [https://openai.com/index/introducing-gpt-5-2/]

Concepts & Research
* Mixture of Experts Explained [https://huggingface.co/blog/moe]
* Pan et al., 2025. Measuring Agents in Production [https://arxiv.org/abs/2512.04123]
* Sandwich Architecture: Build a voice agent with LangChain [https://docs.langchain.com/oss/javascript/langchain/voice-agent]

Human in the Loop is a section of Mark's Substack, where he writes about AI, software quality, and engineering leadership.
Subscribe at wunsch.substack.com [http://wunsch.substack.com] for new episodes, show notes, and the system instructions behind the co-host. This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit wunsch.substack.com/subscribe [https://wunsch.substack.com/subscribe?utm_medium=podcast&utm_campaign=CTA_2]