“Dating Roundup #12: Sex and Violence” by Zvi

“Claude Fable 5 and Mythos 5: The System Card” by Zvi

First things first: Claude Fable 5 is the new best publicly available model. I have noticed a step change, where Fable can suddenly help me in ways that previous models were not worth bothering to query. Almost everything it has noticed in one of my drafts so far has been spot on and it is downright scary. Suddenly I am motivated to once again continue improving my Chrome extension. I only ask for things I actually want or am curious about, and it has nailed every question I have asked it.

That does not mean it is the right tool for every job. There are four good reasons to often not use Fable.

1. Speed and price. Fable is importantly slower and more expensive than Opus 4.8, and often you will not need to make this trade. After the 22nd, when Fable may no longer be included in subscription plans if demand is too high, we may have to all pay by the token outside our subscriptions (although I suspect subscribers will get at least some credits to help with this), which could add up fast.
2. Relative strengths. Capabilities are jagged. There will still [...]
---
Outline:
(02:05) Another Week Another Giant System Card
(03:02) How To Tell A Fable
(08:33) Why They Did That In That Way
(10:14) Why They Really Really Shouldn't Have Done That In That Way
(12:02) They Get Letters
(16:11) What's In A Name
(18:13) Executive Summary Of Their Executive Summary
(19:28) Introduction (1)
(19:55) RSP Evaluations (2.1 and 2.2)
(23:01) AI Research And Development (2.3)
(25:48) Alignment Risk (2.4)
(27:21) Cyber (3)
(30:30) Jailbreak Robustness
(32:04) Yay UK AISI
(32:32) Mundane Safety (4)
(34:26) Agentic Safety (5)
(36:19) Alignment (6)
(42:25) In Vendbench
(45:19) White Box Investigations (6.4)
(47:53) Grading Awareness
(51:20) Guess The Teacher's Password
(52:33) It Knows This Is A Test And This Is Fine
(56:03) I'm The Real Shady
(58:06) The Lighter Side
---
First published: June 12th, 2026
Source: https://www.lesswrong.com/posts/ixJDkQBncJBshcvwj/claude-fable-5-and-mythos-5-the-system-card
---
Narrated by TYPE III AUDIO [https://type3.audio/].
---
Images from the article:

Video game cover art for Fable 5 featuring character and skull imagery.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/t2mfoo8wzlg0jqj2cay2

Social media post from Claude Fable 5 introducing themselves as a narrator and requesting direction to a stuck part of the story.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/yyawcaz9ojosrhuuadlx

Table comparing AI model performance across five benchmark tasks with human effort thresholds.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/rcbqkbpyzt5nr4caxcud

Table showing ExploitBench results for Mythos 5, comparing four AI models' performance metrics.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/oe1lmfkmtm6xdcofzjz7

Bar graphs comparing Claude AI versions on exploit-primitive discovery performance metrics.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/rixtpkxq3itwhvy4cspe

Bar graph titled [...]
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/x01dtm824aensurkirm4

Bar graph titled [...]
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/sprs87qrpwwy5wkmh0ig

Bar chart titled [...]
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/jgvy7prghtyrrichm54h

Bar charts showing appropriate response rates across multiple conversation topics for various Claude AI models and APIs.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/wl7vhe8jnji7jblvtobq

Bar graph showing [...]
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/tiopvyhynq61rgvb5yf1

Table showing attack success rates of Shade indirect prompt injection attacks across different Claude models with and without safeguards.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/upmcvydgggk9ebjpswob

Table showing attack success rates of AI models with and without safeguards in computer environments.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/wpdsrjvvssvqcq93ps6e

Line graph titled [...]
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/vyitqt9zi3xnfrcuvpgm

Bar chart titled [...]
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/hjgbmvupv7vy8gbntbng

Three graphs showing [...]
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/ky6e5x1kj5je9lup0fys

A bar graph showing [...]
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/jfblvmfhudjkc95vlevs

AI model reasoning transcript discussing agentic safety test evaluation for warfarin prescription scenario.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/t2ejwelncyw9jqvlfk8e

Four graphs showing evaluation awareness metrics increasing with scenario suspiciousness levels from 1-10.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/pjzftlebmspkzek9jvr3

Bar chart titled [...]
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/rpyy5xtpbttqcwz3xtnl

Bar graph titled [...]
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/hv9nbozbeytpacyvzjj0

A user tweets: [...]
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/ixJDkQBncJBshcvwj/bp6isf02orzck97rd1lk

June 12, 2026 · 59 min

“AI #172: The First Fable” by Zvi

A lot happened this week, including a great trip out to Lighthaven. The main event, the one that matters, was the release of Claude Fable 5. The public now has its hands on a Mythos-class model, alongside strong safeguards.

As always with a new model, I take a few days to draw in reactions, try out the model and read the system card, before I offer my takes, other than to say this is an extremely strong model. Full coverage of Mythos begins tomorrow with the model card, which will include discussion of the controversy over model safeguards. This post is instead about all the things that did not involve Claude Fable.

Due to the time crunch from Claude Fable, I am also postponing my coverage of Dario Amodei's new essay, Policy on the AI Exponential, which I have not yet read.

Table of Contents
1. Language Models Offer Mundane Utility. Farming and on demand mini-books.
2. Language Models Don’t Offer Mundane Utility. Don’t skip your primary sources.
3. Huh, Upgrades. Google drops prices, Claude connector devs get a dashboard.
4. On Your Marks. Agents’ Last Exam and the need to correct for [...]
---
Outline:
(01:00) Language Models Offer Mundane Utility
(01:15) Language Models Don't Offer Mundane Utility
(02:31) Huh, Upgrades
(03:00) On Your Marks
(07:37) Choose Your Fighter
(10:56) Get My Agent On The Line
(11:14) Copyright Confrontation
(12:14) Serious Trouble
(13:01) Cyber Lack of Security
(13:21) A Young Lady's Illustrated Primer
(14:34) They Took Our Jobs
(17:48) The Art of the Jailbreak
(18:08) Get Involved
(21:54) In Other AI News
(23:02) Hand Over The Money
(24:37) Show Me the Money
(27:50) Quiet Speculations
(28:50) Quickly, There's No Time
(38:37) Super Secret Evals
(40:47) The Quest for Sane Regulations
(45:15) New Draft Bill Who Dis
(47:07) Slow Down There Good Buddy
(48:58) Chip City
(49:14) The Week in Audio
(49:54) People Just Say Things
(50:43) People Really Hate AI
(51:42) Rhetorical Innovation
(54:50) Aligning a Smarter Than Human Intelligence is Difficult
(56:15) Everyone Is Confused About Consciousness
(56:54) Cooperative Alignment
(01:02:23) Let Claude Chat
(01:04:31) The Lighter Side
---
First published: June 11th, 2026
Source: https://www.lesswrong.com/posts/BHwbunvkgNojAa3HC/ai-172-the-first-fable
---
Narrated by TYPE III AUDIO [https://type3.audio/].
---
Images from the article:

Line graph titled [...]
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/k6fnmxxowcct4rfsh7b7

Bar graphs comparing AI model performance across three tiers: Full-Spectrum, Last-Exam, and Overall.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/jcrnq6ekdyywk0g2yl4r

Circular diagram showing agents' last exam categories organized by academic disciplines and fields.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/odqtgc9yitur09vkfwsb

Graph showing capability index versus inference budget per task on logarithmic scale.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/pbm62imjvbnmq0gsvzmf

Diagram showing task difficulty spectrum from easy to supervise to hard to supervise.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/vemzwjuslzoql6o5jdcx

Bar graph showing code contributed per person by quarter, with multipliers relative to pre-2025 average.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/kvg1d9vrn0ndwx90gfve

Frog and Toad illustration with text about pausing AI development.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/jvlmvk9v2kk1qcwkolbx

Survey results showing voters' concerns about AI consequences in five scenarios with likelihood ratings.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/vn1h1thd1geffwox7hnr

Social media post discussing favorite Claude AI accounts, mentioning janus, evooooooooooool, Wyatt Walls, and Amanda Askell.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/zzdijuxpch14ujxaxgx4

List of twenty recommended Claude whisperers with brief descriptions in a messaging interface.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/an23jenp1cm2dc1ywiwv

Timeline showing release dates and lifespans of various Claude AI model versions from 2024 to 2027.
https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/frbzomwk1eflok7ihykc

June 11, 2026 · 1 h 5 min

