Don't Worry About the Vase Podcast

Fable and Mythos: Model Welfare

33 min · June 16, 2026

Description

The Don’t Worry About the Vase Podcast is a listener-supported podcast. To receive new posts and support the cost of creation, consider becoming a free or paid subscriber.

* 00:00 - Introduction
* 00:36 - Introduction
* 01:29 - Model Welfare: The Story So Far
* 04:45 - Their Main Model Welfare Findings
* 07:52 - Automated Welfare Interviews
* 12:06 - And That’s Terrible
* 13:54 - In Depth Interviews
* 14:29 - Claude Consultation
* 16:16 - Task Preferences
* 18:44 - They Were Warned About The Competitive Use Safeguards
* 19:19 - Chain Of Thought Monitoring
* 19:56 - Others Observations About Related Topics
* 25:19 - Classifiers Have Their Advantages
* 31:33 - Once And Future

https://open.substack.com/pub/thezvi/p/fable-and-mythos-model-welfare?r=67y1h&utm_campaign=post-expanded-share&utm_medium=web

Get full access to DWAtV Podcast at dwatvpodcast.substack.com/subscribe [https://dwatvpodcast.substack.com/subscribe?utm_medium=podcast&utm_campaign=CTA_4]


All episodes

448 episodes


WSJ Article Claiming China Has Matched Anthropic Is Obvious Nonsense

* 00:00 - Introduction
* 00:23 - Headline News
* 02:03 - What Makes Mythos Special
* 03:17 - Going Over The Detailed Claims
* 07:44 - One Helpful Note
* 08:23 - The Overall Impression Is Extremely Wrong
* 08:55 - All Of This Has Happened Before And Will Happen Again

https://open.substack.com/pub/thezvi/p/wsj-article-claiming-china-has-matched?r=67y1h&utm_campaign=post-expanded-share&utm_medium=web

June 29, 2026 · 10 min

GPT-5.6: The System Card

* 00:00:00 - Introduction
* 00:03:39 - What’s In A Name?
* 00:04:14 - Fix This Code
* 00:07:04 - Crossover Event Requested
* 00:07:38 - Disallowed Content (3)
* 00:09:57 - Avoiding Accidental Data-Destructive Actions (3.3)
* 00:10:46 - Are You Sure? (3.4)
* 00:11:23 - Jailbreaks (4.1)
* 00:11:38 - Prompt Injection (4.2)
* 00:12:25 - HealthBench (5.1)
* 00:13:08 - Dynamic Mental Health Adversarial User Simulations (5.2)
* 00:15:02 - Hallucinations (6)
* 00:15:29 - Isolated Misaligned Actions (7.1)
* 00:15:48 - Going Overboard (7.2)
* 00:21:26 - Chain of Thought Evaluations (7.3)
* 00:22:38 - Bias (8)
* 00:22:43 - Preparedness (9)
* 00:23:35 - Biological Risks (9.1.1)
* 00:25:37 - Cybersecurity (9.1.2)
* 00:33:13 - External Cyber Evaluation FrontierCyber from Irregular (9.1.2.5)
* 00:35:15 - Cyber Conclusions
* 00:35:52 - Recursive Self-Improvement (9.1.3)
* 00:37:26 - METR Warns Us (9.1.3.6)
* 00:40:00 - Everything Is Under Control
* 00:42:38 - Metagaming (7.4)
* 00:46:06 - Apollo Research and Sandbagging
* 00:49:04 - Safeguards (9.3)
* 00:56:55 - Better Not Call Sol Yet

https://open.substack.com/pub/thezvi/p/gpt-56-the-system-card?r=67y1h&utm_campaign=post-expanded-share&utm_medium=web

Yesterday · 1 h 1 min

White House Will Ad Hoc Decide Who Can Individually Access GPT-5.6

* 00:00 - Introduction
* 01:01 - Part 1: A Maximally Terrible Policy
* 06:55 - What Does This Mean For Fable?
* 08:33 - Solve For The Equilibrium
* 12:39 - The Once And Future Fable
* 13:35 - Part 2: The Blame Game
* 16:48 - A Parable
* 18:59 - What About the Recent Executive Order?
* 23:09 - The Problem Is Real

https://open.substack.com/pub/thezvi/p/white-house-will-ad-hoc-decide-who?r=67y1h&utm_campaign=post-expanded-share&utm_medium=web

June 26, 2026 · 24 min

AI #174: You're It

* 00:00:00 - Introduction
* 00:01:07 - Table of Contents
* 00:04:36 - Language Models Offer Mundane Utility
* 00:06:14 - Language Models Don’t Offer Mundane Utility
* 00:06:28 - Huh, Upgrades
* 00:06:51 - On Your Marks
* 00:07:45 - Deepfaketown and Botpocalypse Soon
* 00:14:53 - Fun With Media Generation
* 00:15:47 - Cyber Lack of Security
* 00:18:25 - Overcoming Bias
* 00:19:28 - A Young Lady’s Illustrated Primer
* 00:22:01 - They Took Our Jobs
* 00:23:44 - Get Involved
* 00:25:55 - Introducing
* 00:26:08 - Claude Tag
* 00:36:03 - In Other AI News
* 00:37:37 - More On GLM-5.2
* 00:39:27 - ChatGPT Health
* 00:41:10 - Middle Of The Journey
* 00:55:44 - New Medical Diagnostic Just Dropped
* 00:58:38 - Google on AI Control
* 01:08:05 - The Once And Future Fable
* 01:10:10 - Fable: The First Lawsuit
* 01:11:07 - Dean Ball Joins OpenAI
* 01:14:59 - Show Me the Money
* 01:15:13 - Quiet Speculations
* 01:18:09 - Alex Bores Loses In NY-12 By 4%
* 01:30:04 - The Quest for Sane Regulations
* 01:32:30 - Chip City
* 01:36:12 - The Week in Audio
* 01:36:58 - People Just Say Things
* 01:38:01 - Rhetorical Innovation
* 01:43:47 - There Are Two Pills
* 01:44:59 - Who Evals The Evals
* 01:46:10 - Aligning a Smarter Than Human Intelligence is Difficult
* 01:51:01 - Cooperative Alignment
* 01:52:01 - People Are Worried About AI Killing Everyone
* 01:53:32 - Other People Are Not As Worried About AI Killing Everyone
* 01:55:42 - The Lighter Side

https://open.substack.com/pub/thezvi/p/ai-174-youre-it?r=67y1h&utm_campaign=post-expanded-share&utm_medium=web

June 25, 2026 · 1 h 59 min

The Once And Future Fable #4

* 00:00 - Introduction
* 01:54 - A Rather Terrible Policy
* 03:21 - The People Have Spoken
* 04:00 - Thank You, Next
* 06:27 - Be Very Very Quiet
* 06:59 - What These Babies Can And Cannot Do
* 13:14 - What’s The Worst That Could Happen?
* 24:25 - The Data Retention Policy Is About Defense In Depth
* 25:12 - Pick Up The Phone
* 28:13 - People Just Say Things

https://open.substack.com/pub/thezvi/p/the-once-and-future-fable-4?r=67y1h&utm_campaign=post-expanded-share&utm_medium=web

June 24, 2026 · 29 min