“Retrying vs Resampling in AI Control” by James Lucassen, Adam Kaufman
We’ve just released a new paper: Retrying vs Resampling in AI Control. We revisit the resampling protocols introduced in Ctrl-Z with an up-to-date setting and much stronger models, and compare them against “retrying” protocols similar to Claude Code auto mode or Codex Auto-review.
Motivation
Roughly a year ago we released Ctrl-Z, the first paper to study control techniques for agents. A headline result of that paper was the performance of resample protocols – strategies that involve taking multiple i.i.d. samples from the model per step. But since Ctrl-Z, models have gotten much stronger, and we have built more sophisticated control settings to keep up. We wanted to answer the following questions:
How well do the results from Ctrl-Z hold up with better models and a better setting? Current high-stakes control research is trying to learn by analogy about how to do control effectively in a real high-stakes deployment during a real intelligence explosion. Findings about technique performance are going to have to generalize pretty far to be useful.
If the resample protocols from Ctrl-Z still work, what makes them work? One way we try to make our work more generalizable is by understanding the dynamics governing outcomes [...]
---
Outline:
(00:30) Motivation
(02:35) TL;DR Takeaways
(04:34) Methodology
(07:14) Differences from Ctrl-Z
(12:14) Are Retrying Protocols Exploitable?
(15:25) Cost and Latency of Resampling
(17:24) Conclusion
The original text contained 7 footnotes which were omitted from this narration.
---
First published:
May 29th, 2026
Source:
https://blog.redwoodresearch.org/p/retrying-vs-resampling-in-ai-control [https://blog.redwoodresearch.org/p/retrying-vs-resampling-in-ai-control?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration]
---
Narrated by TYPE III AUDIO [https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=redwood_research&utm_campaign=ai_narration].
---
Images from the article:
Bar graph showing safety percentages across different resampling and monitoring methods. [https://substackcdn.com/image/fetch/$s_!4w3J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe64f218-7438-4347-bfaf-b52bed062809_2048x838.png]
----------------------------------------
Two graphs showing [https://substackcdn.com/image/fetch/$s_!Hc35!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b09301e-a1f2-4dde-a58c-562e713c8f17_2048x768.png]
----------------------------------------
Three bar graphs titled [https://substackcdn.com/image/fetch/$s_!fqZJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef92c796-b596-4af7-b2d9-1634c24b99a4_2048x696.png]
----------------------------------------
Two graphs showing post-resample suspicion distributions and safety percentages at 0.3% audit budget. [https://substackcdn.com/image/fetch/$s_!0TGO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd5fd9cc-2a06-4daa-8258-ca0e52cad754_2048x851.png]
----------------------------------------
Two graphs comparing safety policies: [https://substackcdn.com/image/fetch/$s_!pMSs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00cae980-01cc-4f1d-b4d4-eb0c7af71677_2048x805.png]
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts [https://pocketcasts.com/], or another podcast app.