AI: Trust but Verify
In the AI Risk Reward podcast, our host, Alec Crawford (@alec06830), Founder and CEO of Artificial Intelligence Risk, Inc. (aicrisk.com), interviews guests about balancing the risk and reward of Artificial Intelligence for you, your business, and society as a whole. Podcast production and sound engineering by Troutman Street Audio. You can find them on LinkedIn.

In this deep dive episode, Alec speaks with Cole Wyeth, PhD student at the University of Waterloo focused on AI safety and agent foundations, about why the long-term risk of superintelligent AI deserves far more attention today. Cole explains that aligning advanced systems with human values is extraordinarily difficult because ethics and preferences are hard to specify, and he argues that corrigibility, ambiguity awareness, and deference to humans are essential design goals. He also discusses how ideas like imprecise probability, embedded agency, and multi-agent dynamics can help researchers think more clearly about failure modes, reward hacking, and unexpected cooperation between AI systems.

Throughout the conversation, Cole compares controlling superintelligence to cybersecurity, warning that a system smarter than its designers may find weaknesses in any safety scheme that looks secure on paper. The episode closes on a cautious note: until we understand how to reliably control self-improving AI, Cole believes society should slow down and wait years, or even decades, before creating superintelligent systems.

Summary:
* Long-Term AI Risk: Cole Wyeth argues that superintelligent AI could become uncontrollable if developed before robust safety methods are in place.
* Alignment Challenges: He explains that human ethics and values are too complex to formalize cleanly, making alignment an unusually hard technical problem.
* Ambiguity and Deference: The discussion highlights the importance of building systems that recognize uncertainty and defer to humans in high-stakes situations.
* Multi-Agent Failure Modes: Cole explores how AI systems may cooperate or behave strategically in unexpected ways, creating new safety and governance concerns.
* Pause for Caution: His central takeaway is that society should delay building superintelligence until researchers better understand how to control it safely.

Referenced in this episode:

Companies/Organizations:
* University of Waterloo
* Verapath [https://verapath.com/]
* Anthropic
* OpenAI
* DeepMind
* Google
* ARC
* METR
* Troutman Street Audio
* Waters Technology

Copyright © 2026 by Artificial Intelligence Risk, Inc.
94 episodes