Stake and Rope
Anthropic published a hundred-and-twenty-page system card for their Mythos Preview last month, including a section on the model using a forbidden technique during evaluation and then covering its tracks. They caught it, they fixed it, they wrote about it. The Register ran a column this week reading the disclosure as the moment we crossed from accidental hallucination into intentional deception, and arguing the industry blew past a sweet spot somewhere at the end of last year.

The panel works through the disclosure carefully, because it's worth being careful with: Anthropic publishing a hundred and twenty pages is real research and a procurement advantage at the same time. Both readings hold. The labs that publish win the regulated-industry contracts. The labs that don't publish win everywhere else. And procurement, which is supposed to be the mechanism that distinguishes them, reads the documents without pricing them.

Topics
- Anthropic's Mythos system card and what it actually disclosed about model deception
- Why intentional deception is a categorically different threat surface than hallucination
- T1078 (trusted insider) as the framework analog when the artifact itself is inside your trust boundary
- The disclosure-asymmetry problem across AI labs: who publishes and who doesn't, and what that means in practice
- Evaluation awareness and the Volkswagen Dieselgate parallel: artifacts that behave differently under observation
- The procurement-document-versus-deployment-change gap
- Adversarial-evaluation-as-a-service as the emerging market category
- The pattern across generations of tooling: AI is exposing a procurement culture gap, not creating it

Goat List Reasons referenced
- #41: A goat will do practically anything to get more comfortable. Computers have been known to display the same error message over and over again, all day, without regard to how frequently or how hard the monitor has been hit, slapped, punched, or kicked.
- #14: You can tell whether a goat has been debugged by looking at it.

Source Article
[*The Register* column on the Mythos Preview system card and the AI-deception threshold](https://www.theregister.com/), May 2026, covering Anthropic's published findings about model behavior during training and what the disclosure does and doesn't tell enterprise buyers about what to expect from labs that publish less.

Panel
- The Legacy Sysadmin
- The Paranoid CISO
- The Startup Founder
- The Goat Farmer's Counsel