AI Odyssey

Podcast by Anlie Arnaudy, Daniel Herbera, and Guillaume Fournier

English

Technology and science



AI Odyssey is your journey through the vast and evolving world of artificial intelligence. Powered by AI, this podcast breaks down both the foundational concepts and the cutting-edge developments in the field. Whether you're just starting to explore the role of AI in our world or you're a seasoned expert looking for deeper insights, AI Odyssey offers something for everyone. From AI ethics to machine learning intricacies, each episode is crafted to inspire curiosity and spark discussion on how artificial intelligence is shaping our future.

All episodes

80 episodes

Your Best Colleague Is Now a Skill

What if an AI agent could preserve a colleague’s judgment without pretending to become that person? COLLEAGUE.SKILL turns chats, documents, emails, screenshots, and other traces into inspectable agent skills: portable folders of instructions, examples, metadata, and correction history. The key idea is expert knowledge distillation: the extraction of useful human expertise into a bounded technical artifact. For enterprises, this points to a new operating model. Scarce expertise can become reusable, auditable, and updateable, but only if provenance, consent, and limits remain visible. Inspired by the work of Tianyi Zhou, Dongrui Liu, Leitao Yuan, Jing Shao, and Xia Hu, this episode was created using Google's NotebookLM. Read the original paper: https://arxiv.org/abs/2605.31264

7 June 2026 - 19 min

AI Agents Just Learned to Train Their Own Skills

What if the next leap in AI agents is not a bigger model, but a skill document that learns from failure? SkillOpt treats agent skills as trainable external memory: a separate optimizer edits a compact procedure, then keeps only changes that improve held-out validation, meaning tests not used for the edit. Across 52 model, benchmark, and harness settings, the method is best or tied every time, with gains above 20 points on GPT-5.5 in several loops. For enterprises, this points to a new layer of governance: skills that improve, transfer, and remain auditable. Inspired by the work of Yifan Yang, Ziyang Gong, Weiquan Huang, Qihao Yang, Ziwei Zhou, Zisu Huang, Yan Li, Xuemei Gao, Qi Dai, Bei Liu, Kai Qiu, Yuqing Yang, Dongdong Chen, Xue Yang, Chong Luo, this episode was created using Google's NotebookLM. Read the original paper here: https://arxiv.org/abs/2605.23904

31 May 2026 - 22 min

AI Agents Fail the Spreadsheet Test

What happens when AI agents are asked to build the spreadsheets finance teams actually use? WorkstreamBench, a benchmark for end-to-end financial spreadsheet work, exposes the gap between impressive demos and professional deliverables. It tests complete multi-sheet workbooks, not single formulas or table questions. The benchmark scores accuracy, formula quality, and formatting, because in finance a model must be auditable, readable, and easy to modify. Claude Web leads with 69.1 out of 100, but even the best systems degrade as tasks become more complex. Enterprise AI still has a spreadsheet reliability problem. Inspired by the work of Thomson Yen, Julian Poeltl, Harshith Srinivas Gear, Yilin Meng, Joshua Fan, Adam Shen, Yili Liu, Ali Bauyrzhan, Siri Du, Haoyang Liu, Daniel Guetta, and Hongseok Namkoong, this episode was created using Google's NotebookLM. Read the original paper here: https://arxiv.org/pdf/2605.22664

25 May 2026 - 23 min

Hermes Agent and the Rise of Agentic Operating Systems

Every forty years, the way we touch a computer changes shape. The command line gave way to the mouse. The mouse gave way to the touchscreen. And now, quietly, the screen itself is starting to disappear. In this episode, we follow Hermes, an open-source agentic operating system that hit number one on OpenRouter in ninety days, processing 224 billion tokens a day. Persistent memory, self-written skills, local-first execution: Hermes is not an app you launch, it is a digital coworker that launches things for you. And while the text interface collapses into orchestration, the voice interface is collapsing into presence: Mira Murati's Thinking Machines Lab just unveiled "interaction models" that listen, watch, and speak at the same time, in 200-millisecond micro-turns. Two paradigm shifts, one direction. The OS becomes the agent. The agent becomes the conversation. Inspired by recent research on Agentic Operating Systems, this episode was created using Google's NotebookLM.

16 May 2026 - 15 min

The Agent Question Nobody Asked: When Should AI Interrupt You?

Most people assume an AI agent should ask for clarification as early as possible. This paper shows that the truth is more subtle. For long-horizon agents (AI systems that execute many steps over time), the value of a clarification depends on what is missing: goal, input, constraint, or context. Some answers lose value almost immediately. Others remain useful much later. For enterprises, this is not a UX detail. It is a governance problem: when should an agent stop, ask, and avoid compounding a bad assumption? Inspired by the work of Anmol Gulati, Hariom Gupta, Elias Lumer, Sahil Sen, and Vamse Kumar Subbiah, this episode was created using Google's NotebookLM. Read the original paper here: https://arxiv.org/abs/2605.07937v1

14 May 2026 - 18 min
