Don't Worry About the Vase Podcast

Claude Sonnet 5 Is Not Frontier But Has Its Uses

50 min · Yesterday

Description

The Don’t Worry About the Vase Podcast is a listener-supported podcast. To receive new posts and support the cost of creation, consider becoming a free or paid subscriber.

* 00:00 - Introduction
* 02:32 - Mythos Exists
* 03:13 - Introduction (1)
* 03:17 - RSP Evaluations (2)
* 04:20 - Cyber (3)
* 04:43 - Safeguards and Harmlessness (4)
* 05:35 - Agentic Safety (5)
* 08:31 - Alignment (6)
* 14:12 - Illegible Thinking (6.4.5)
* 15:19 - Evaluation Awareness
* 15:59 - Honesty and Hallucinations (6.5)
* 17:14 - Flagged As Unhealthy? (6.5.1)
* 17:47 - Model Welfare (7)
* 26:21 - Live From AI Village
* 28:12 - For I Contain Multitudes
* 34:40 - Official Benchmarks
* 40:17 - Other People’s Benchmarks
* 40:42 - Positive Reactions
* 45:39 - Negative Reactions
* 49:33 - A Short Sonnet

https://open.substack.com/pub/thezvi/p/claude-sonnet-5-is-not-frontier-but?r=67y1h&utm_campaign=post-expanded-share&utm_medium=web

Get full access to DWAtV Podcast at dwatvpodcast.substack.com/subscribe [https://dwatvpodcast.substack.com/subscribe?utm_medium=podcast&utm_campaign=CTA_4]

All episodes

450 episodes

Claude Sonnet 5 Is Not Frontier But Has Its Uses

Yesterday · 50 min

The Once And Future Fable #5

* 00:00 - Introduction
* 01:12 - Table of Contents
* 02:24 - You Should See The Other Guy
* 03:03 - DeepMind Coders Of The World, Unite
* 03:59 - Report Your Incidents
* 04:19 - Good Guy With An AI
* 06:03 - Free As In To Give It A Shot
* 09:32 - Everything Is Both Speech And Computer
* 12:30 - Lambs To The Slaughter
* 17:27 - A Sign Saying Beware Of The Leopard
* 18:34 - The Once And Present Mythos
* 21:59 - What Is To Be Done
* 25:45 - Distillation
* 28:24 - What Would Banning Open Source Even Mean
* 29:24 - Open Weight Models Are Unsafe And Nothing Can Fix This

https://open.substack.com/pub/thezvi/p/the-once-and-future-fable-5?r=67y1h&utm_campaign=post-expanded-share&utm_medium=web

Yesterday · 32 min

WSJ Article Claiming China Has Matched Anthropic Is Obvious Nonsense

* 00:00 - Introduction
* 00:23 - Headline News
* 02:03 - What Makes Mythos Special
* 03:17 - Going Over The Detailed Claims
* 07:44 - One Helpful Note
* 08:23 - The Overall Impression Is Extremely Wrong
* 08:55 - All Of This Has Happened Before And Will Happen Again

https://open.substack.com/pub/thezvi/p/wsj-article-claiming-china-has-matched?r=67y1h&utm_campaign=post-expanded-share&utm_medium=web

June 29, 2026 · 10 min

GPT-5.6: The System Card

* 00:00:00 - Introduction
* 00:03:39 - What’s In A Name?
* 00:04:14 - Fix This Code
* 00:07:04 - Crossover Event Requested
* 00:07:38 - Disallowed Content (3)
* 00:09:57 - Avoiding Accidental Data-Destructive Actions (3.3)
* 00:10:46 - Are You Sure? (3.4)
* 00:11:23 - Jailbreaks (4.1)
* 00:11:38 - Prompt Injection (4.2)
* 00:12:25 - HealthBench (5.1)
* 00:13:08 - Dynamic Mental Health Adversarial User Simulations (5.2)
* 00:15:02 - Hallucinations (6)
* 00:15:29 - Isolated Misaligned Actions (7.1)
* 00:15:48 - Going Overboard (7.2)
* 00:21:26 - Chain of Thought Evaluations (7.3)
* 00:22:38 - Bias (8)
* 00:22:43 - Preparedness (9)
* 00:23:35 - Biological Risks (9.1.1)
* 00:25:37 - Cybersecurity (9.1.2)
* 00:33:13 - External Cyber Evaluation FrontierCyber from Irregular (9.1.2.5)
* 00:35:15 - Cyber Conclusions
* 00:35:52 - Recursive Self-Improvement (9.1.3)
* 00:37:26 - METR Warns Us (9.1.3.6)
* 00:40:00 - Everything Is Under Control
* 00:42:38 - Metagaming (7.4)
* 00:46:06 - Apollo Research and Sandbagging
* 00:49:04 - Safeguards (9.3)
* 00:56:55 - Better Not Call Sol Yet

https://open.substack.com/pub/thezvi/p/gpt-56-the-system-card?r=67y1h&utm_campaign=post-expanded-share&utm_medium=web

June 28, 2026 · 1 h 1 min

White House Will Ad Hoc Decide Who Can Individually Access GPT-5.6

* 00:00 - Introduction
* 01:01 - Part 1: A Maximally Terrible Policy
* 06:55 - What Does This Mean For Fable?
* 08:33 - Solve For The Equilibrium
* 12:39 - The Once And Future Fable
* 13:35 - Part 2: The Blame Game
* 16:48 - A Parable
* 18:59 - What About the Recent Executive Order?
* 23:09 - The Problem Is Real

https://open.substack.com/pub/thezvi/p/white-house-will-ad-hoc-decide-who?r=67y1h&utm_campaign=post-expanded-share&utm_medium=web

June 26, 2026 · 24 min