The MAD Podcast with Matt Turck
What actually happens before a frontier AI model gets released, and who decides whether it is safe enough? In this episode of The MAD Podcast, Matt Turck sits down with Zico Kolter (OpenAI board member, Head of the Machine Learning Department at Carnegie Mellon, and co-founder of Gray Swan) for a deep conversation on the real risks of frontier AI. They discuss how OpenAI’s safety oversight works before major model releases, why more powerful models do not automatically become safer, how jailbreaks and prompt injection expose real weaknesses in AI systems, why AI agents dramatically expand the attack surface, and where frontier AI is headed next. A clear, practical discussion of OpenAI, AI safety, AI security, AI agents, frontier models, red teaming, reinforcement learning, and the future of AI governance.

(00:00) Intro
(01:32) OpenAI board role and Safety & Security Committee
(03:53) How OpenAI reviews major model releases
(05:33) OpenAI’s preparedness framework explained
(09:46) Are frontier AI models getting safer?
(12:33) Why AI safety does not come from scale
(15:23) The four categories of AI risk
(19:38) Doomerism vs accelerationism in AI
(24:11) The six-month AI pause debate
(26:20) AI safety as a global effort
(28:04) How Zico Kolter got into machine learning
(31:05) OpenAI in the early days
(34:14) Why Carnegie Mellon became an AI powerhouse
(38:43) What Gray Swan does in AI security
(40:44) AI safety vs AI security
(43:15) The GCG jailbreak paper
(49:19) How AI labs responded to jailbreak research
(50:19) State-of-the-art AI defenses
(52:32) State-of-the-art AI attacks
(54:22) Why AI agents expand the attack surface
(58:39) Are AI agents ready for production?
(59:40) Mechanistic interpretability explained
(1:02:31) Will AI be safer in two years?
(1:03:46) Reinforcement learning and self-improving models
(1:08:09) Do post-transformer architectures matter?
(1:09:29) Best research directions in AI now
(1:11:00) Zico Kolter’s Intro to Modern AI course
(1:14:53) Why modern AI is simpler than people think
115 episodes