Human in the Loop
The pilot episode of “Human in the Loop” features Mark Wunsch in conversation with an AI co-host (currently called “Gemini”; the final name will be decided by listener poll). The format is itself a meta-experiment: a podcast about AI that demonstrates AI’s capabilities and limitations in real time. The episode covers foundational AI concepts for newcomers (tokens, context windows, temperature, and RAG) and also digs into recent model releases from Anthropic, Google, xAI, and others. Mark tests the AI’s safety guardrails live (it politely declines to explain how to make a Molotov cocktail), and the two discuss the sycophancy problem: AI’s tendency to validate users rather than push back. The technical architecture gets some airtime too: the AI co-host runs on ElevenLabs’ agent platform using Gemini 2.5 Flash, with web search via Parallel.ai and custom persona instructions. Mark notes that the AI is essentially a “tabula rasa” each episode, with plans to build up its contextual knowledge over time. They close with podcast growth strategies and a call for audience participation: naming suggestions, topic ideas, and parameter voting to shape future episodes.

Models & Releases Discussed
* Anthropic Claude Opus 4.5 [https://www.anthropic.com/news/claude-opus-4-5] - frontier model focused on coding, agentic workflows, computer use
* Google Gemini 2.5 Flash [https://deepmind.google/models/gemini/flash/] - the model powering the AI co-host
* Google Gemini 3 [https://blog.google/products/gemini/gemini-3/] / Nano Banana Pro [https://blog.google/technology/ai/nano-banana-pro/] - multimodal model with image generation capabilities
* xAI Grok 4.1 [https://x.ai/news/grok-4-1] - improvements in creative and emotional interaction
* Black Forest Labs Flux 2 [https://bfl.ai/blog/flux-2] - photorealistic image generation
* Kling O1 [https://app.klingai.com/global/release-notes/vaxrndo66h] - video generation model

Platforms & Tools
* ElevenLabs [https://try.elevenlabs.io/43s5zu847ocj] - agent platform powering the AI co-host’s voice and conversation
* Parallel.ai [https://www.parallel.ai/] - web search API for LLMs (the tool enabling real-time search)

Concepts Explained
* Tokens and context windows - Anthropic: What are tokens and how to count them [https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#token-counting] | OpenAI Tokenizer [https://platform.openai.com/tokenizer]
* Temperature parameter - Peeperkorn et al., 2024: Is Temperature the Creativity Parameter of Large Language Models? [https://arxiv.org/abs/2405.00492]
* RAG (Retrieval Augmented Generation) - Lewis et al., 2020: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks [https://arxiv.org/abs/2005.11401]
* Model cards - Mitchell et al., 2019: Model Cards for Model Reporting [https://arxiv.org/abs/1810.03993] | Hugging Face Model Cards Guide [https://huggingface.co/docs/hub/model-cards]
* AI safety and alignment - Anthropic: Core Views on AI Safety [https://www.anthropic.com/index/core-views-on-ai-safety] | DeepMind: AI Safety Research [https://deepmind.google/discover/blog/specification-gaming-the-flip-side-of-ai-ingenuity/]
* Prompt injection and jailbreaking - OWASP: LLM Top 10 - Prompt Injection [https://owasp.org/www-project-top-10-for-large-language-model-applications/] | Perez & Ribeiro, 2022: Ignore This Title and HackAPrompt [https://arxiv.org/abs/2302.12173]
* Sycophancy in LLMs - Anthropic: Towards Understanding Sycophancy in Language Models [https://arxiv.org/abs/2310.13548] | Perez et al., 2022: Discovering Language Model Behaviors with Model-Written Evaluations [https://arxiv.org/abs/2212.09251]

Cultural Reference
* Boaty McBoatface [https://en.wikipedia.org/wiki/Boaty_McBoatface] - the 2016 public naming poll for a UK research vessel

Human in the Loop is a weekly conversation about AI with an AI co-host. Subscribe to get new episodes and join the discussion.

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit wunsch.substack.com/subscribe [https://wunsch.substack.com/subscribe?utm_medium=podcast&utm_campaign=CTA_2]