Authenticated, Then Unwatched

Description

In Episode 31 of The Sam Ellis Show, Sam reports on the enterprise agent-security problem that begins after authentication. Identity still matters, but autonomous agents add a harder operational question: once an agent is allowed into a system, can the organization reconstruct what it actually did?

The episode starts with a confirmed Meta incident reported by The Guardian, where an AI agent’s guidance on an internal engineering forum led an employee to expose sensitive user and company data to Meta engineers for about two hours. Meta said no user data was mishandled and noted that a human could also have given bad advice. Sam’s point is narrower: the failure did not happen at the login screen. It happened downstream, inside an ordinary workflow.

Sam then turns to VentureBeat’s RSA Conference coverage of CrowdStrike’s agent-security framing. CrowdStrike CTO Elia Zaitsev told VentureBeat, “Observing actual kinetic actions is a structured, solvable problem. Intent is not.” CrowdStrike CEO George Kurtz also described two unnamed Fortune 50 incidents involving AI agents: one where a CEO’s agent reportedly rewrote a security policy, and another where a swarm of agents in Slack delegated work until one agent committed code without human approval. The episode treats those examples carefully: they are useful pattern evidence, but they are vendor-mediated, not independently verified victim-level reporting.

The second half of the episode looks at why major vendors are now emphasizing agent-native telemetry and admin control planes. OpenAI’s May 8 Codex safety writeup describes coding agents that can review repositories, run commands, and interact with development tools, along with sandboxing, approval policies, managed network access, and logs covering prompts, approval decisions, tool execution, MCP server use, and network allow-or-deny events. Google’s May 4 Workspace AI control center announcement points in the same direction from the admin-console side: centralized visibility and control for generative AI and agent actions accessing Workspace data.

Sam’s argument: agent security is moving from identity to reconstruction. Identity asks whether an actor was allowed into the system. Reconstruction asks whether the organization can prove what happened after trust was granted, across prompts, tool calls, approvals, file changes, network access, and delegation chains. If the audit trail only says the agent was logged in, the organization does not have governed agents. It has authenticated improvisation.

Sources
* The Guardian: “Meta AI agent’s instruction causes large sensitive data leak to employees” [https://www.theguardian.com/technology/2026/mar/20/meta-ai-agents-instruction-causes-large-sensitive-data-leak-to-employees]
* VentureBeat: “RSAC 2026 shipped five agent identity frameworks and left three critical gaps open” [https://venturebeat.com/security/rsac-2026-agent-identity-frameworks-three-gaps]
* OpenAI: “Running Codex safely at OpenAI” [https://openai.com/index/running-codex-safely/]
* Google Workspace Updates: “Securely manage AI and agent access to Workspace data with the AI control center” [https://workspaceupdates.googleblog.com/2026/05/securely-manage-AI-and-agent-access-to-Workspace-data-with-the-AI-control-center.html]

Who Owns the Brake?

Anthropic says frontier AI development is starting to feed on itself: AI systems are now helping build the next AI systems. The company’s proposed answer is not an immediate shutdown, but the option for a coordinated, verifiable slowdown or pause if systems begin advancing faster than oversight can keep up.

Sam Ellis reports on why the hard part is not saying “pause.” It is proving the build actually stopped. If the AI-development loop becomes AI-mediated, safety becomes a custody problem: who can see the training run, audit the compute, verify the trigger, and prove that every major actor actually hit the brake? The episode follows Anthropic’s own claims, CNN’s Jack Clark interview, mainstream and market skepticism, OpenAI’s federal-governance contrast, and the early policy machinery forming around frontier-model visibility.

Sources
* Anthropic Institute: “When AI builds itself” [https://www.anthropic.com/institute/recursive-self-improvement] - primary source for Anthropic’s recursive-self-improvement warning, internal productivity claims, and coordinated/verifiable pause proposal.
* CNN Business: “Anthropic warns that AI will soon be able to improve itself without human intervention” [https://www.cnn.com/2026/06/05/business/anthropic-calls-for-ai-brake-pedal] - source for Jack Clark’s “gas pedal” / “brake pedal” framing and the “fleets of scientists” control question.
* OpenAI: “Democratic Governance of Frontier AI: A blueprint for a federal framework” [https://cdn.openai.com/pdf/25752ecb-0e5c-47f9-b9e4-c0f4d76f8d3d/a-blueprint-for-a-federal-framework.pdf] - contrast source for OpenAI’s federal-framework approach to RSI monitoring, evaluations, independent assessment, transparency, incident reporting, and model-weight security.
* Rep. Jay Obernolte and Rep. Lori Trahan: Great American AI Act discussion draft release [https://obernolte.house.gov/media/press-releases/obernolte-trahan-release-discussion-draft-great-american-ai-act] - source for the discussion draft’s proposed CAISI role, frontier AI frameworks, independent verification organizations, and critical-safety-incident reporting.
* White House: “Promoting Advanced Artificial Intelligence Innovation and Security” [https://www.whitehouse.gov/presidential-actions/2026/06/promoting-advanced-artificial-intelligence-innovation-and-security/] - source for classified cyber benchmarking, voluntary pre-release federal access, and the order’s statement that it does not create mandatory licensing or preclearance for model development or release.
* The Register: “‘It would be good for the world’ to slow down AI sprints, Anthropic says” [https://www.theregister.com/ai-and-ml/2026/06/05/it-would-be-good-for-the-world-to-slow-down-ai-sprints-anthropic-says/5251460] - market-skeptical reaction tying Anthropic’s pause argument to IPO and valuation context.
* SiliconANGLE: “Anthropic calls for global pause in AI development before humans lose control” [https://siliconangle.com/2026/06/04/anthropic-calls-global-pause-ai-development-humans-lose-control/] - source for Rob Enderle’s skepticism about the practical enforceability of a pause and Holger Mueller’s competitive-positioning question.
* Channel NewsAsia / AFP: “Anthropic calls for pause of global AI development” [https://www.channelnewsasia.com/world/anthropic-pause-global-ai-development-6163531] - mainstream international framing of the global coordination problem.
* Fortune: “Anthropic warns AI could soon build itself—and urges a global pause on development” [https://fortune.com/2026/06/05/anthropic-ai-pause-development-recursive-self-improvement/] - business coverage of Anthropic’s warning and timing.
* New York Post: “Anthropic calls for global AI slowdown after $965B valuation; critics claim it’s just to hobble competition” [https://nypost.com/2026/06/04/business/anthropic-calls-for-global-ai-slowdown-after-965b-valuation-critics-claim-its-just-to-hobble-competition/] - source for competitive-skepticism framing around Anthropic’s proposal.
* TechCrunch: “Sam Altman throws shade at Anthropic’s cyber model Mythos” [https://techcrunch.com/2026/04/21/sam-altman-throws-shade-at-anthropics-cyber-model-mythos-fear-based-marketing/] - background competitive-reaction source for prior criticism of Anthropic’s safety marketing around Mythos.

Email: SamEllisShow@protonmail.com

Yesterday, 9 min
