LessWrong posts by zvi

“AI #172: The First Fable” by Zvi

1 h 5 min · Yesterday

Description

A lot happened this week, including a great trip out to Lighthaven. The main event, the one that matters, was the release of Claude Fable 5. The public now has its hands on a Mythos-class model, alongside strong safeguards.

As always with a new model, I take a few days to draw in reactions, try out the model and read the system card, before I offer my takes, other than to say this is an extremely strong model. Full coverage of Mythos begins tomorrow with the model card, which will include discussion of the controversy over model safeguards. This post is instead about all the things that did not involve Claude Fable.

Due to the time crunch from Claude Fable, I am also postponing my coverage of Dario Amodei's new essay, Policy on the AI Exponential, which I have not yet read.

Table of Contents
1. Language Models Offer Mundane Utility. Farming and on demand mini-books.
2. Language Models Don’t Offer Mundane Utility. Don’t skip your primary sources.
3. Huh, Upgrades. Google drops prices, Claude connector devs get a dashboard.
4. On Your Marks. Agents’ Last Exam and the need to correct for [...]
---
Outline:
(01:00) Language Models Offer Mundane Utility
(01:15) Language Models Don't Offer Mundane Utility
(02:31) Huh, Upgrades
(03:00) On Your Marks
(07:37) Choose Your Fighter
(10:56) Get My Agent On The Line
(11:14) Copyright Confrontation
(12:14) Serious Trouble
(13:01) Cyber Lack of Security
(13:21) A Young Lady's Illustrated Primer
(14:34) They Took Our Jobs
(17:48) The Art of the Jailbreak
(18:08) Get Involved
(21:54) In Other AI News
(23:02) Hand Over The Money
(24:37) Show Me the Money
(27:50) Quiet Speculations
(28:50) Quickly, There's No Time
(38:37) Super Secret Evals
(40:47) The Quest for Sane Regulations
(45:15) New Draft Bill Who Dis
(47:07) Slow Down There Good Buddy
(48:58) Chip City
(49:14) The Week in Audio
(49:54) People Just Say Things
(50:43) People Really Hate AI
(51:42) Rhetorical Innovation
(54:50) Aligning a Smarter Than Human Intelligence is Difficult
(56:15) Everyone Is Confused About Consciousness
(56:54) Cooperative Alignment
(01:02:23) Let Claude Chat
(01:04:31) The Lighter Side
---
First published: June 11th, 2026
Source: https://www.lesswrong.com/posts/BHwbunvkgNojAa3HC/ai-172-the-first-fable
---
Narrated by TYPE III AUDIO [https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=lesswrong&utm_campaign=ai_narration].
---
Images from the article:
Line graph [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/k6fnmxxowcct4rfsh7b7]
Bar graphs comparing AI model performance across three tiers: Full-Spectrum, Last-Exam, and Overall. [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/jcrnq6ekdyywk0g2yl4r]
Circular diagram showing agents' last exam categories organized by academic disciplines and fields. [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/odqtgc9yitur09vkfwsb]
Graph showing capability index versus inference budget per task on logarithmic scale. [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/pbm62imjvbnmq0gsvzmf]
Diagram showing task difficulty spectrum from easy to supervise to hard to supervise. [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/vemzwjuslzoql6o5jdcx]
Bar graph showing code contributed per person by quarter, with multipliers relative to pre-2025 average. [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/kvg1d9vrn0ndwx90gfve]
Frog and Toad illustration with text about pausing AI development. [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/jvlmvk9v2kk1qcwkolbx]
Survey results showing voters' concerns about AI consequences in five scenarios with likelihood ratings. [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/vn1h1thd1geffwox7hnr]
Social media post discussing favorite Claude AI accounts, mentioning janus, evooooooooooool, Wyatt Walls, and Amanda Askell. [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/zzdijuxpch14ujxaxgx4]
List of twenty recommended Claude whisperers with brief descriptions in a messaging interface. [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/an23jenp1cm2dc1ywiwv]
Timeline showing release dates and lifespans of various Claude AI model versions from 2024 to 2027. [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/BHwbunvkgNojAa3HC/frbzomwk1eflok7ihykc]

Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts [https://pocketcasts.com/], or another podcast app.

“AI #171: False Flag” by Zvi

This was the week of Claude Opus 4.8. I covered the model card, then model welfare concerns, and finally capabilities and reactions. It's a good model, sir, an incremental but real improvement over Opus 4.7, and it is now my clear daily driver.

The Trump Executive Order returned from being seemingly dead, officially putting us in the prior restraint era of frontier model releases, even if they do not call it that. There are some worrisome details, especially around putting too much responsibility on the NSA rather than CAISI and classifying the testing process, and things could go in very bad directions, but I am tentatively happy about this on net.

OpenAI offered us a new policy blueprint. It seems remarkably good, and I want to hold off on my full coverage to give it the attention it deserves, likely in its own post. By contrast, their political operations are also engaged in some rather terrible activities, which I do cover here.

Table of Contents
1. Language Models Offer Mundane Utility. You put your doc in a box.
2. Language Models Don’t Offer Mundane Utility. All thinking is adaptive.
3. Huh, Upgrades. Codex computer use on [...]
---
Outline:
(01:11) Language Models Offer Mundane Utility
(06:45) Language Models Don't Offer Mundane Utility
(06:59) Huh, Upgrades
(08:13) On Your Marks
(08:33) Choose Your Fighter
(08:54) Get My Agent On The Line
(09:20) Cyber Lack of Security
(11:03) Deepfaketown and Botpocalypse Soon
(12:42) You Didn't Write That
(16:29) Copyright Confrontation
(16:45) They Took Our Jobs
(18:47) They Taxed Our Jobs
(22:15) The Art of the Jailbreak
(24:43) Get Involved
(26:35) Introducing
(26:48) In Other AI News
(27:14) Show Me the Money
(27:28) Show Me The Compute
(28:40) Where Did The Money Go
(29:44) People Just Say Things
(32:22) OpenAI PACs Just Say Things
(41:04) OpenAI PAC Engaged In False Flag Advocacy For Violence
(46:55) So Sayeth The Pope
(54:26) Bubble, Bubble, Toil and Trouble
(56:42) Quiet Speculations
(57:11) We Need Mandatory Nucleic Acid Screening and Recordkeeping
(01:01:14) The Quest for Sane Regulations
(01:02:56) More Reaction To The Executive Order
(01:03:54) Chip City
(01:07:18) The Week in Audio
(01:07:34) Rhetorical Innovation
(01:09:49) Aligning a Smarter Than Human Intelligence is Difficult
(01:16:18) Model Welfare
(01:26:47) Messages From Janusworld
(01:28:05) Other People Are Not As Worried About AI Killing Everyone
(01:28:38) The Lighter Side
---
First published: June 4th, 2026
Source: https://www.lesswrong.com/posts/LzxoR5GakceQFtbta/ai-171-false-flag
---
Narrated by TYPE III AUDIO [https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=lesswrong&utm_campaign=ai_narration].
---
Images from the article:
Bar graph [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/LzxoR5GakceQFtbta/rbikqrmnpin2eyl7xlz4]
Chat conversation with Meta AI support assistant about linking email address and verification code. [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/LzxoR5GakceQFtbta/vvrecetp1izaga0mv7h9]
Peter Wilde Ford tweets [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/LzxoR5GakceQFtbta/sfq7ctscxcugjmtaucsl]
Two Twitter profile screenshots showing contrasting views on AI's impact. [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/LzxoR5GakceQFtbta/pvwkbu1hrrf5hha9rils]
Bar graph [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/LzxoR5GakceQFtbta/yfugapreqm6fmiusfcrj]
Slide [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/LzxoR5GakceQFtbta/juxu2drcxmffiyzxudks]
Decision matrix showing outcomes of treating AI well versus badly, based on whether AI can suffer. [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/LzxoR5GakceQFtbta/vnxpsdtx75gzwbzow1we]
Table showing wellbeing scores for different AI task categories with example user messages. [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/LzxoR5GakceQFtbta/dmt7dibnikhuvmjsmcor]
Vertical scale measuring AI wellbeing from creative work to jailbreak. [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/LzxoR5GakceQFtbta/hfn3p68xv1ylfqgjigvd]
Social media post showing an AI-generated overview of Sam Kriss's essay about AI writing threats. [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/LzxoR5GakceQFtbta/y8hch1gces34xytpfgnx]
Dark gray square on black background. [https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/LzxoR5GakceQFtbta/b71bleqnkss6xv0xmhdx]

Jun 4, 2026 · 1 h 30 min
