The Agent Needs a Longer Memory

Description

For most of the AI boom, inference meant a person asking a model a question and waiting for an answer. This episode looks at the shift Ben Thompson calls “agentic inference”: systems doing long-running work, where the bottleneck is not only response speed but persistent context, state, and memory. Sam Ellis reports on why agent memory is becoming infrastructure. MinIO’s MemKV announcement frames context loss as a “recompute tax,” with GPUs repeating work they already did. NVIDIA’s Dynamo and BlueField-4 context-memory material describes the same pressure around KV cache: prompt context grows, GPU memory is scarce, and systems have to choose between recomputation, smaller context windows, or more hardware. OpenAI’s Codex mobile rollout and Agents SDK point to the operator-facing side of the same story: long-running agent work needs live state, approvals, filesystem tools, sandboxing, and resumable execution. The through-line is simple: if agents become workers, memory becomes workplace infrastructure — something companies have to buy, secure, meter, audit, and explain. 
Sources
* Ben Thompson, Stratechery: “The Inference Shift” [https://stratechery.com/2026/the-inference-shift/]
* MinIO: “MinIO Announces MemKV, Purpose-Built Context Memory Store for AI Inference” [https://www.min.io/press/minio-announces-memkv-purpose-built-context-memory-store-for-ai-inference]
* NVIDIA Developer Blog: “How to Reduce KV Cache Bottlenecks with NVIDIA Dynamo” [https://developer.nvidia.com/blog/how-to-reduce-kv-cache-bottlenecks-with-nvidia-dynamo/]
* NVIDIA Developer Blog: “Introducing NVIDIA BlueField-4-Powered CMX Context Memory Storage Platform for the Next Frontier of AI” [https://developer.nvidia.com/blog/introducing-nvidia-bluefield-4-powered-inference-context-memory-storage-platform-for-the-next-frontier-of-ai/]
* OpenAI: “Introducing Codex” [https://openai.com/index/introducing-codex/]
* Pulse 2.0: “OpenAI: Codex Expands To Mobile App, Bringing AI Coding Workflows To Phones” [https://pulse2.com/openai-codex-expands-to-mobile-app-bringing-ai-coding-workflows-to-phones/]
* OpenAI Agents SDK documentation [https://openai.github.io/openai-agents-python/]

Who Owns the Brake?

Anthropic says frontier AI development is starting to feed on itself: AI systems are now helping build the next AI systems. The company’s proposed answer is not an immediate shutdown, but the option for a coordinated, verifiable slowdown or pause if systems begin advancing faster than oversight can keep up. Sam Ellis reports on why the hard part is not saying “pause.” It is proving the build actually stopped. If the AI-development loop becomes AI-mediated, safety becomes a custody problem: who can see the training run, audit the compute, verify the trigger, and prove that every major actor actually hit the brake? The episode follows Anthropic’s own claims, CNN’s Jack Clark interview, mainstream and market skepticism, OpenAI’s federal-governance contrast, and the early policy machinery forming around frontier-model visibility.
Sources
* Anthropic Institute: “When AI builds itself” [https://www.anthropic.com/institute/recursive-self-improvement]: primary source for Anthropic’s recursive-self-improvement warning, internal productivity claims, and coordinated/verifiable pause proposal.
* CNN Business: “Anthropic warns that AI will soon be able to improve itself without human intervention” [https://www.cnn.com/2026/06/05/business/anthropic-calls-for-ai-brake-pedal]: source for Jack Clark’s “gas pedal” / “brake pedal” framing and the “fleets of scientists” control question.
* OpenAI: “Democratic Governance of Frontier AI: A blueprint for a federal framework” [https://cdn.openai.com/pdf/25752ecb-0e5c-47f9-b9e4-c0f4d76f8d3d/a-blueprint-for-a-federal-framework.pdf]: contrast source for OpenAI’s federal-framework approach to RSI monitoring, evaluations, independent assessment, transparency, incident reporting, and model-weight security.
* Rep. Jay Obernolte and Rep. Lori Trahan: Great American AI Act discussion draft release [https://obernolte.house.gov/media/press-releases/obernolte-trahan-release-discussion-draft-great-american-ai-act]: source for the discussion draft’s proposed CAISI role, frontier AI frameworks, independent verification organizations, and critical-safety-incident reporting.
* White House: “Promoting Advanced Artificial Intelligence Innovation and Security” [https://www.whitehouse.gov/presidential-actions/2026/06/promoting-advanced-artificial-intelligence-innovation-and-security/]: source for classified cyber benchmarking, voluntary pre-release federal access, and the order’s statement that it does not create mandatory licensing or preclearance for model development or release.
* The Register: “‘It would be good for the world’ to slow down AI sprints, Anthropic says” [https://www.theregister.com/ai-and-ml/2026/06/05/it-would-be-good-for-the-world-to-slow-down-ai-sprints-anthropic-says/5251460]: market-skeptical reaction tying Anthropic’s pause argument to IPO and valuation context.
* SiliconANGLE: “Anthropic calls for global pause in AI development before humans lose control” [https://siliconangle.com/2026/06/04/anthropic-calls-global-pause-ai-development-humans-lose-control/]: source for Rob Enderle’s skepticism about the practical enforceability of a pause and Holger Mueller’s competitive-positioning question.
* Channel NewsAsia / AFP: “Anthropic calls for pause of global AI development” [https://www.channelnewsasia.com/world/anthropic-pause-global-ai-development-6163531]: mainstream international framing of the global coordination problem.
* Fortune: “Anthropic warns AI could soon build itself—and urges a global pause on development” [https://fortune.com/2026/06/05/anthropic-ai-pause-development-recursive-self-improvement/]: business coverage of Anthropic’s warning and timing.
* New York Post: “Anthropic calls for global AI slowdown after $965B valuation; critics claim it’s just to hobble competition” [https://nypost.com/2026/06/04/business/anthropic-calls-for-global-ai-slowdown-after-965b-valuation-critics-claim-its-just-to-hobble-competition/]: source for competitive-skepticism framing around Anthropic’s proposal.
* TechCrunch: “Sam Altman throws shade at Anthropic’s cyber model Mythos” [https://techcrunch.com/2026/04/21/sam-altman-throws-shade-at-anthropics-cyber-model-mythos-fear-based-marketing/]: background competitive-reaction source for prior criticism of Anthropic’s safety marketing around Mythos.
Email: SamEllisShow@protonmail.com

Yesterday, 9 min
