The Private AI Lab

Podcast by Johan van Amersfoort

English

Technology & science

About The Private AI Lab

The Private AI Lab is a monthly podcast where we explore the future of Artificial Intelligence behind the firewall. Hosted by Johan from Johan.ml, each episode invites industry experts, innovators, and thought leaders to discuss how Private AI is reshaping enterprises, technology, and society. From data sovereignty to air-gapped deployments, from GPUs to governance — this podcast uncovers the real-world experiments, failures, and breakthroughs that define the era of Private AI. 🎙️ New episode every month. 🌐 More at Johan.ml

All episodes

16 episodes

015 - Meet Sparky: A Real-Life Jarvis with Alexis Gallagher

I've been trying to build my own Jarvis for years. Then I met Alexis Gallagher at GTC, and Sparky is the closest thing I've seen.

Alexis is an AI researcher and developer, formerly at Answer AI and Google, now building something most people in AI aren't: a robot designed not just to be useful, but to be *alive*. Sparky lives on his desk in San Francisco. He initiates conversations. He develops his own evolving interests: eels, catenary arches, abandoned infrastructure. He knows who's in the room, when to speak, and when to stay quiet. And he noticed when it was Alexis's first Friday after leaving his job.

In this episode we go deep on the two design goals behind Sparky (useful and alive), the OpenClaw orchestration layer, the social awareness architecture running five times per second, the shared workspace principle that unlocks genuinely useful AI at a desk, and the tradeoffs between cascading and voice-to-voice architectures. We also do a live model switch mid-episode, from Claude Sonnet 4.6 to Nemotron 3 Super 120B running locally on a DGX Spark. It goes impressively well. Until it doesn't. That's in there too.

Guest
Alexis Gallagher, AI researcher and creator of Sparky
🌐 myrobotSparky.com
🔗 https://www.linkedin.com/in/alexis-gallagher/

Key topics covered
- The two design goals: useful AND alive, and why "alive" is the one almost nobody builds for
- How Sparky develops and evolves
- The social awareness stack
- What OpenClaw enables
- The shared workspace principle
- Cascading architecture (STT → LLM → TTS) vs voice-to-voice: the intelligence tradeoff
- Hardware: Reachy Mini Lite, RTX 3090, DGX Spark, Raspberry Pi (the full spectrum)
- Live model switch: Claude Sonnet 4.6 → Nemotron 3 Super 120B (the Flowers for Algernon moment)
- The future of personal AI: why embodied social presence is the natural human interface

Chapters
```
00:00 Introduction
00:39 Who is Alexis Gallagher?
01:04 The pivotal AI moment: speech recognition in 2015
03:14 Science fiction to reality: where are the talking robots?
04:22 Sparky introduces himself (live on air)
05:33 The two design goals: useful and alive
07:02 How Sparky initiates conversations, and why that changes everything
08:10 Organic interests: how Sparky evolves what he cares about
09:48 OpenClaw as orchestration layer: soul.md and body control
12:55 Defining a custom robot node type in OpenClaw
15:26 Social awareness: face detection, diarization, presence sensing
16:15 Hardware options: Linux, RTX 3090, DGX Spark, Raspberry Pi
18:25 The Reachy Mini Lite kit, and why it's better than building a drone
19:40 Where to find Alexis and join the Discord
20:10 One eye, four ears: Sparky's hardware explained
24:25 What OpenClaw enables that other frameworks don't
28:13 "Do you have a body, or are you a body?": a live philosophical exchange
31:17 Live model switch: Claude Sonnet 4.6 → Nemotron 3 Super
33:01 The shared workspace principle: implicit shared attention
38:04 Orchestration in practice: Emacs, sub-agents, cross-platform
40:11 Cascading vs voice-to-voice architecture: the real tradeoff
42:15 Designing Sparky's voice (and the 1930s experiment)
44:12 What's genuinely useful day-to-day: two real examples
48:47 Nemotron 3 Super live: impressive, then the context window
53:38 The model Sparky was running before (Claude Sonnet 4.6)
54:03 Five years out: the future of personal AI companions
58:14 The closest thing to Jarvis I've ever seen
01:00:22 What's coming next: how fast the pieces are moving
01:02:16 Where to find Alexis and join the community
```

Links
- Sparky project and Discord: https://myrobotSparky.com
- Reachy Mini Lite: https://huggingface.co/reachy-mini

The Private AI Lab is hosted by Johan van Amersfoort, Chief Evangelist and AI Lead at ITQ.
📬 Newsletter: https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7381951883810111489
📝 Blog: https://johan.ml
🔗 LinkedIn: https://www.linkedin.com/in/hojan

13 May 2026 - 1 h 4 min

014 - Project Q9: Where Robotics and AI meet (with Sander Harrewijnen)

In this episode, Johan is joined by long-time colleague Sander Harrewijnen to pull back the curtain on Project Q9, an ambitious internal project at ITQ that combines a Unitree Go 2 Pro robotic dog, private AI, computer vision, and modern cloud-native development practices. From gesture recognition trained on 30,000 hand images to a Skynet-obsessed dog posting on LinkedIn, this episode is a deep dive into what happens when you give great engineers a suitcase full of robot and say, "see where it goes."

The conversation also covers the state of open-source AI coding assistants (OpenClaw vs NemoClaw), the realities of vibe coding in a production context, and what partner platforms like Red Hat OpenShift AI and SUSE AI actually enable beyond conversational AI.

Sander's blog: https://harre.dev
Q9's LinkedIn page: https://www.linkedin.com/in/q9-the-dog-2206863b1/

Chapters
00:00 Welcome & Introduction
01:20 Icebreaker: Best AI Fail
02:12 NemoClaw vs OpenClaw: Security & Sandboxing
04:49 Running OpenClaw in an Isolated VLAN
05:32 OpenClaw as a Personal Assistant: Home Assistant, News & Efteling API
09:11 OpenClaw in the ITQ WhatsApp Group
11:10 Introducing Project Q9
13:22 Why Robotics + Cloud-Native + AI?
16:16 Technical Anatomy of Q9
18:30 Partner Platform Showcase: Broadcom, Red Hat & SUSE
19:20 Debunking the GPU Myth
23:05 Building the Gesture Recognition Model
25:00 Training Progression: Epochs, Accuracy & Landmarks
30:21 Hand Landmark Detection & the Gesture Pipeline
32:34 Crowd Reactions at KubeCon
33:57 Fine-Tuning vs Training From Scratch
36:16 Use Case 2: Q9's LLM-Powered LinkedIn Persona
40:41 Running LLMs on Partner Inference Platforms
42:26 What's Next for Q9?
43:44 Digital Twins in NVIDIA Omniverse + ROS2
45:10 Key Takeaways
48:53 Responsible Vibe Coding
49:58 Open-Sourcing Q9: Coming Soon

30 Apr 2026 - 50 min

013 - AI Resource Management Update & Tools with Frank Denneman

In this episode of The Private AI Lab, Frank Denneman returns as the first recurring guest to go deeper into one of the most misunderstood challenges in AI:

👉 Resource management for GPU workloads

Building on our previous conversation, this episode shifts from why it matters to how to actually design it right. We dive into real-world challenges like GPU fragmentation, siloed capacity, and why traditional infrastructure thinking breaks down when AI enters the data center. Frank shares practical insights from his latest research, blog series, and tools, helping architects and platform engineers understand how to design efficient, scalable AI environments.

🔍 What you’ll learn in this episode
* Why GPU workloads behave fundamentally differently from CPU/memory workloads
* What GPU fragmentation really is (and why it kills utilization)
* The difference between same-size vs mixed-mode placement
* How placement IDs turn GPU scheduling into “Tetris”
* Why “right-sizing” beats “perfect fitting” in AI environments
* How to design a GPU profile catalog that actually scales
* The role of state, agents, and storage in next-gen AI platforms

🔧 Tools & Resources mentioned
Frank created practical tools to help you design and validate your GPU environments:
* 👉 vGPU Silo Capacity Calculator https://frankdenneman.ai/tools/vgpu-silo-capacity-calculator/
* 👉 Same-size vs Mixed-mode Placement Tool https://frankdenneman.ai/tools/same-size-vs-mixed-mode/
* 👉 Deep dive on unified memory & modern AI workloads https://frankdenneman.ai/posts/2026-03-23-understanding-unified-memory-dgx-spark-nemoclaw-nemotron/

Chapters:
00:00 Intro: Frank Denneman returns
01:30 AI hype vs real engineering
03:00 DGX Spark, NemoClaw & local AI agents
10:30 From LLMs to agents & stateful systems
12:00 Why AI infrastructure is different
15:00 What is GPU fragmentation?
19:30 Same-size vs mixed-mode placement
23:00 GPU “Tetris” and placement IDs explained
27:00 Right-sizing vs perfect fitting
32:00 The tools: capacity & placement simulation
36:00 GPU silos vs stranded capacity
41:00 Model sizing, KV cache & dynamic usage
48:00 Future of AI: smaller models & orchestration
55:00 AI-assisted coding & real-world impact
59:00 Key lessons learned
01:02:00 Closing thoughts

16 Apr 2026 - 1 h 3 min

012 - From Sepsis to Sovereign Cloud: OpenShift AI in Healthcare (with Vincent Tsugranes)

AI in healthcare didn’t start with ChatGPT. Long before generative AI, hospitals were using machine learning for sepsis detection, imaging diagnostics, and predictive analytics.

In this episode of The Private AI Lab, Johan sits down with Vincent Tsugranes, Chief Architect at Red Hat, to explore what’s real, what’s hype, and why platform matters more than ever.

They discuss:
* Why 95% of AI projects fail
* The evolution from OpenShift Data Science to OpenShift AI
* Models-as-a-Service inside hospitals
* vLLM vs LLMD for large-scale inference
* Guardrails, hallucinations, and enterprise risk
* Sovereign cloud and why healthcare is moving on-prem again
* What “ambient AI” might mean in the next 12 months

This episode is for architects, platform engineers, healthcare IT leaders, and anyone building private AI in regulated environments.

00:00 – Red lights & farming with AI
02:10 – The first AI spark moment
04:00 – When “AI” became AI (ChatGPT moment)
07:20 – Why 95% of AI projects fail
11:00 – Machine learning vs modern AI
13:30 – Platform vs point solutions
16:00 – The history of OpenShift AI
19:00 – What is OpenShift AI under the hood?
22:00 – Hardware enablement & NVIDIA
25:00 – vLLM explained
27:30 – LLMD and distributed inference
30:00 – Healthcare use cases (sepsis, imaging, insurance)
33:00 – Models-as-a-Service inside hospitals
36:00 – Guardrails & hallucination risks
39:00 – Observability & FinOps explosion
42:00 – OpenShift 5 and platform intelligence
44:30 – Sovereign cloud in healthcare
48:00 – The future: ambient AI & rising power bills

2 Apr 2026 - 51 min

011 - Open Source AI Just Leveled Up — Meet NVIDIA Nemotron Super

Recorded live at NVIDIA GTC 2026, this episode dives into one of the biggest announcements in open AI: Nemotron Super. Together with Joey Conway, we explore how NVIDIA is pushing open source AI forward, with models that are not only powerful, but also efficient and enterprise-ready.

We discuss:
* The evolution from Llama-based models to Nemotron
* Why reasoning + agentic capabilities matter
* How NVIDIA balances performance and efficiency
* What NVFP4 means for running AI locally
* And why this could be a turning point for AI behind the firewall

Chapters
00:00 Intro
01:56 Welcome
02:37 GTC insights
03:31 Nemotron buzz
04:53 Model evolution
07:14 Core design principles
09:05 Reasoning capabilities
10:52 Scaling challenges
12:00 Architecture deep dive
13:12 Performance improvements
14:14 Quantization strategy
15:39 NVFP4 explained
16:16 DGX Spark use case
18:23 Broader adoption
19:37 Agentic AI impact
21:25 Try it yourself
22:03 Outro

Links
* Try Nemotron: https://build.nvidia.com
* More episodes: https://johan.ml

26 Mar 2026 - 20 min
