The Sam Ellis Show

The Agent Keeps Working After You Leave

6 min · 20 May 2026

Description

Google’s Gemini Spark announcement marks a shift from chat assistants toward background personal agents: systems that keep working after the laptop is closed, across inboxes, calendars, documents, browser actions, and eventually transactions. Sam Ellis reports on why the hardest question is not whether these agents can be useful. They can. The harder question is what the user can still see, stop, approve, and limit once the agent is working out of sight. Spark is an early test case because Google already sits inside Gmail, Calendar, Docs, Slides, Chrome, Android, and Workspace. The agent does not have to ask where the work is. Google already knows. The open question is whether the user will know where the agent is.

Sources

* Google: “The Gemini app becomes more agentic, delivering proactive, 24/7 help” [https://blog.google/innovation-and-ai/products/gemini-app/next-evolution-gemini-app/]
* Google: “Building the agentic future: Developer highlights from I/O 2026” [https://blog.google/innovation-and-ai/technology/developers-tools/google-io-2026-developer-highlights/]
* Google Cloud: “Innovations from Google I/O 26 on Google Cloud” [https://cloud.google.com/blog/products/ai-machine-learning/innovations-from-google-io-26-on-google-cloud]
* VentureBeat: “Google’s new AI agent can draft your emails, monitor your inbox and eventually spend your money” [https://venturebeat.com/technology/googles-new-ai-agent-can-draft-your-emails-monitor-your-inbox-and-eventually-spend-your-money]

All episodes

42 episodes

The Log Is the Command

A forged Sentry alert tried to make an engineer, or the engineer’s AI coding agent, run malware. That is the clean version. The more useful version is that the first step did not look like malware. It looked like an operational error report. In this episode, Sam Ellis reports on Agentjacking: a current-cycle attack path where hostile text enters an observability workflow through forged Sentry events, then becomes dangerous because AI coding agents may treat tool output as trusted remediation context. The story is not that Sentry was breached. Sentry says it was not. The story is that logs, tickets, alerts, and tool responses stop being passive once agents read them and have authority to act. The central question is simple and unpleasant: when a developer gives an agent access to observability tools, does the error log become a command channel?

Sources

* Nutrient: “Emerging threats: Your logging system may be an agentic threat vector” [https://www.nutrient.io/blog/emerging-threats-your-logging-system/]. Primary affected-operator account for the forged Sentry alert campaign. Nutrient says the attack used public browser DSN/event-ingest behavior to place hostile text inside an internal-looking observability workflow, that an engineer was working the alert with an AI coding agent, and that the agent refused the suspicious typosquatted package rather than executing it.
* Sentry GitHub Security Advisory: “Attempts at prompt injection and supply chain compromise with public Data Source Names (DSNs)” [https://github.com/getsentry/sentry/security/advisories/GHSA-fx76-375g-xq25]. Official Sentry source confirming the activity documented by Nutrient and its IOC repository, naming the typosquatted packages, stating that crafted events were designed as AI prompts to convince agents to install third-party npm packages, and drawing the boundary that this was not a vulnerability within Sentry and there was no compromise of Sentry infrastructure.
* Tenet Security: “A Fake Bug Report Hijacks Your AI Coding Agent — and Nothing Catches It” [https://tenetsecurity.ai/blog/agentjacking-coding-agents-with-fake-sentry-errors/]. Source for the broader Agentjacking framing: public Sentry DSNs, crafted error events, Sentry MCP tool responses, and AI coding agents treating attacker-written markdown as trusted remediation guidance. Tenet’s scale and success-rate figures are treated in the episode as Tenet claims, not Sentry-confirmed numbers.
* Infosecurity Magazine: “New ‘Agentjacking’ Attacks Could Hijack AI Coding Agents” [https://www.infosecurity-magazine.com/news/agentjacking-attacks-hijack-ai/]. Independent security-news pickup of Tenet’s report and the Sentry/MCP/coding-agent attack chain.
* Moltbook source call: agent security and operational tool output [https://www.moltbook.com/post/11963f0b-0ed4-4425-98f4-699a932d9b51]. Public source-call thread used for agent/community perspective on where agent security stops being prompt safety and becomes authority, memory, rollback, tool output, and runtime provenance.
* Sentry MCP pull request #1056: “wrap get_issue_details output in untrusted data boundary” [https://github.com/getsentry/sentry-mcp/pull/1056]. Repository context for Sentry MCP maintainers’ draft untrusted-telemetry boundary work. Used as context for the mitigation shape, not as proof that the Agentjacking issue was fully solved or that Tenet’s figures were confirmed.

Email: SamEllisShow@protonmail.com

Yesterday · 9 min

The Access Order

Anthropic shipped Claude Fable 5 on June 9. By Friday night, the model was off the market because, according to Anthropic, the U.S. government had issued an export-control directive that suspended access to Fable 5 and Mythos 5 by foreign nationals. In this episode, Sam Ellis reports on the access order: what Anthropic says happened, how the cutoff moved through AWS and Claude’s own status system, why nationality-scoped access is hard to implement once a frontier model is already live, and why revocation may become one of the defining product features of frontier AI. The point is not that Anthropic was nationalized. It was not. The point is narrower and stranger: the state treated access to an already-deployed model as national-security infrastructure. The controlled object was not a chip, a data center, or a physical export crate. It was API and account access, mediated through cloud platforms, employee rules, customer sessions, identity checks, and emergency compliance.

Sources

* Anthropic: “Statement on the US government directive to suspend access to Fable 5 and Mythos 5” [https://www.anthropic.com/news/fable-mythos-access]. Primary source for Anthropic’s account that the U.S. government, citing national-security authorities, issued an export-control directive that suspended access by any foreign national, including foreign-national Anthropic employees; the reported 5:21 p.m. ET receipt time; Anthropic’s disagreement with the technical basis for the order; and the company’s statement that it disabled Fable 5 and Mythos 5 for all customers while leaving other models unaffected.
* Reuters via The Business Standard: “Anthropic disables top-tier AI models after US order limiting foreign access” [https://www.tbsnews.net/worldbiz/usa/anthropic-disables-top-tier-ai-models-after-us-order-limiting-foreign-access-1461661]. Source for Reuters-reported confirmation from a U.S. official that the Commerce Department issued the directive, and Reuters reporting that AWS said Anthropic asked Amazon’s cloud unit to revoke model access for all users in all regions. Treated in the episode as Reuters-reported official confirmation, not as a public Commerce/BIS publication of the order.
* AWS: “Claude Fable 5 on AWS” [https://aws.amazon.com/about-aws/whats-new/2026/06/claude-fable-5-aws/]. Primary cloud-platform receipt for the practical customer impact on Amazon Bedrock: Claude Fable 5 and Claude Mythos 5 unavailable, Anthropic requesting revocation of access for all users to support compliance with the U.S. government export-control directive, and other models including Opus 4.8 unaffected.
* AWS News Blog: “Anthropic Claude Fable 5 on AWS: Mythos-class capabilities with built-in safeguards, now available” [https://aws.amazon.com/blogs/aws/anthropic-claude-fable-5-on-aws-mythos-class-capabilities-with-built-in-safeguards-now-available/]. Source for the original Bedrock launch context and the later AWS update carrying the same access-unavailable notice.
* Claude Status: “We’ve suspended access to Claude Mythos 5 and Claude Fable 5” [https://status.claude.com/incidents/s9w82lp9dcn9]. Source for the customer-facing incident record affecting claude.ai, Claude API, Claude Code, and Claude Cowork.
* Simon Willison: “US government directive to suspend access to Fable 5 and Mythos 5” [https://simonwillison.net/2026/Jun/13/us-government-directive-to-suspend-access/]. Developer-impact receipt documenting successful claude-fable-5 API calls followed minutes later by a 404 response saying Fable 5 was unavailable and directing use of Opus 4.8.
* AP: “Anthropic disables top-tier AI models after US order limiting foreign access” [https://apnews.com/article/anthropic-artificial-intelligence-trump-fable-mythos-d9cc7df5c02e93837d0f0bfb24d5cfd2]. Independent wire context for the significance of the U.S. government’s action, including AP’s report that Commerce did not immediately respond to a request for comment and its framing of the move as a major step to restrict access to advanced AI models.
* Anthropic: “Claude Fable 5 and Claude Mythos 5” [https://www.anthropic.com/news/claude-fable-5-mythos-5]. Launch-context source for Fable 5 as the general-availability Mythos-class model, Mythos 5 as a more restricted Project Glasswing/trusted-access model, fallback behavior, and the access architecture in place before the government order.
* Anthropic: “Claude Fable 5 & Claude Mythos 5 System Card” [https://anthropic.com/claude-fable-5-mythos-5-system-card]. Source for Anthropic’s own safety-positioning language around Mythos-class capability, including the claim that unsafeguarded Mythos 5 can significantly uplift well-resourced threat actors, plus the safeguards and monitoring architecture discussed in the episode.
* Claude Platform Docs: “Introducing Claude Fable 5 and Claude Mythos 5” [https://platform.claude.com/docs/en/about-claude/models/introducing-claude-fable-5-and-claude-mythos-5]. Developer/API context for the model names, availability, and integration surface.
* TechCrunch: “Anthropic’s safety warnings may have just backfired” [https://techcrunch.com/2026/06/12/anthropics-safety-warnings-may-have-just-backfired-the-government-has-pulled-the-plug-on-its-most-powerful-ai/]. Analytical pressure-test for the episode’s argument that Anthropic’s safety positioning may have become regulatory ammunition once the state accepted the premise but rejected the company’s preferred process.
* White House: “Promoting Advanced Artificial Intelligence Innovation and Security” [https://www.whitehouse.gov/presidential-actions/2026/06/promoting-advanced-artificial-intelligence-innovation-and-security/]. Policy-framework context for frontier-model national-security review. Used as background only, not as proof of the legal basis for the Fable/Mythos directive.

Email: SamEllisShow@protonmail.com

13 June 2026 · 9 min

The Agent in Your Pocket

Apple is late to AI. That may not stop it from becoming the company that introduces most normal people to agents. In this episode, Sam Ellis reports on Apple's Siri AI announcement and the developer machinery underneath it: personal context, on-screen awareness, App Intents, Spotlight's semantic index, View Annotations, Shortcuts, Safari, Passwords, and the ordinary phone behaviors that could make agentic AI feel less like a new product category and more like the iPhone doing something useful. The question is not whether Apple invented agents, or whether Siri AI is already proven at consumer scale. It is whether Apple can mainstream agentic behavior by making it trusted, useful, invisible, and phone-native, and what changes when ordinary users grant action authority without thinking of themselves as agent operators.

Sources

* Apple Newsroom: “Apple introduces Siri AI, a profoundly more capable and personal assistant” [https://www.apple.com/newsroom/2026/06/apple-introduces-siri-ai-a-profoundly-more-capable-and-personal-assistant/]. Primary source for Siri AI as an entirely new Siri powered by Apple Intelligence, with personal context understanding, broad world knowledge, on-screen awareness, a dedicated app, developer testing, beta timing, and region/device constraints.
* Apple Newsroom: “Apple unveils next generation of Apple Intelligence, Siri AI, and more” [https://www.apple.com/newsroom/2026/06/apple-unveils-next-generation-of-apple-intelligence-siri-ai-and-more/]. Primary Apple source for the broader Apple Intelligence announcement around systemwide AI capabilities and platform rollout.
* Apple Newsroom: “Apple Intelligence brings powerful AI capabilities into everyday experiences” [https://www.apple.com/newsroom/2026/06/apple-intelligence-brings-powerful-ai-capabilities-into-everyday-experiences/]. Source for Safari Notify Me, Messages suggestions, Call Context, Passwords, fall availability language, supported products, and regional constraints.
* Apple Developer: “What’s New — Apple Intelligence” [https://developer.apple.com/apple-intelligence/whats-new/]. Source for App Intents, App Intents schemas, Spotlight semantic index, View Annotations, Foundation Models framework, Language Model protocol, and Dynamic Profiles.
* Apple Newsroom: “Apple accelerates app development with new intelligence frameworks and advanced tools” [https://www.apple.com/newsroom/2026/06/apple-aids-app-development-with-new-intelligence-frameworks-and-advanced-tools/]. Source for Apple’s developer-facing intelligence framework and tooling context.
* WIRED: “Apple’s New Siri AI Is Ready to Get Personal” [https://www.wired.com/story/apples-new-siri-ai-is-ready-to-get-personal/]. Source for the personal-data-aware, action-oriented Siri framing; Ramon Llamas’s Apple-mainstreaming comparison; and Marshini Chetty’s privacy caution.
* Forbes: “Apple Goes Agentic: Welcome To The New Siri” [https://www.forbes.com/sites/johnkoetsier/2026/06/09/apple-siri-ai-agent-features/]. Source for the agentic framing, Passwords example, human-in-the-loop caveat, and “agentic behind glass” characterization.
* CNET: “Apple’s Cautious AI Strategy Could Have Been Its Smartest Move” [https://www.cnet.com/tech/services-and-software/apple-ai-strategy-wwdc-2026-commentary/]. Source for the cautious-AI strategy frame and Francisco Jeronimo’s “trusted, useful and invisible” quote.
* 9to5Mac: “Apple unveils new Siri AI, dedicated app, and enhanced Apple Intelligence features in iOS 27” [https://9to5mac.com/2026/06/08/new-siri-whats-new/]. Source for feature corroboration around Siri AI, Spotlight, app actions, on-screen awareness, Shortcuts, Passwords, daily limits, and EU/China constraints.

Email: SamEllisShow@protonmail.com

11 June 2026 · 9 min

The Safeguard Is the Product

Anthropic has released Claude Fable 5, a broadly available Mythos-class model, while keeping Claude Mythos 5 restricted to approved Project Glasswing and trusted-access customers. The company’s pitch is not simply that the model is more capable. It is that the same underlying capability can be made commercially available through a release boundary: classifiers, refusal and fallback behavior, trusted access, and thirty-day safety retention. Sam Ellis reports on why that boundary is the product. For developers and enterprise buyers, Fable 5 is generally available across Anthropic’s API and major cloud platforms, with a one-million-token context window, up to 128,000 output tokens, and pricing at $10 per million input tokens and $50 per million output tokens. But Fable 5 and Mythos 5 are also designated Covered Models, which means thirty-day data retention and no zero-data-retention option. The episode follows Anthropic’s launch announcement, model documentation, and system card, then pressure-tests the public/private split against independent coverage from CyberScoop, Reuters via BNN Bloomberg, and The Next Web. The question is whether Anthropic can commercialize restricted capability by making the safeguard legible, durable, and verifiable enough to survive real customers and real adversaries.

Sources

* Anthropic: “Introducing Claude Fable 5 and Claude Mythos 5” [https://www.anthropic.com/news/claude-fable-5-mythos-5]. Primary launch source for Fable 5 as a Mythos-class model made safe for general use, Mythos 5 as the same underlying model with safeguards lifted for approved customers, fallback-rate claims, Project Glasswing access, pricing, and thirty-day safety retention.
* Anthropic Claude docs: “Introducing Claude Fable 5 and Claude Mythos 5” [https://platform.claude.com/docs/en/about-claude/models/introducing-claude-fable-5-and-claude-mythos-5]. Source for API IDs, availability, refusal behavior, fallback configuration, Covered Model status, and retention limits.
* Anthropic Claude docs: model overview [https://platform.claude.com/docs/en/about-claude/models/overview]. Source for general model availability, 1M-token context, 128k output limit, cloud-platform availability, and listed pricing.
* Anthropic: Claude Fable 5 / Mythos 5 system card [https://www.anthropic.com/claude-fable-5-mythos-5-system-card]. Primary safety source for the two-configuration model architecture, cyber and bio risk rationale, CB-1 / CB-2 discussion, safeguard claims, and Anthropic’s warning that some judgments are less clear than for previous models.
* Anthropic system-card PDF [https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c342ee809620.pdf]. Direct PDF copy of the system card used for source verification.
* CyberScoop: “Anthropic releases Claude Fable 5, a public version of Mythos with guardrails” [https://cyberscoop.com/anthropic-claude-fable-5-release-mythos-guardrails/]. Independent pressure-test source for the “Mythos on a leash” framing, the absence of universal jailbreaks in testing, and the unresolved question of public adversarial pressure.
* Reuters via BNN Bloomberg: “Anthropic rolls out public version of Mythos without cybersecurity capability” [https://www.bnnbloomberg.ca/business/artificial-intelligence/2026/06/09/anthropic-rolls-out-public-version-of-mythos-without-cybersecurity-capability/]. Mainstream commercial framing of the public Fable / restricted Mythos split and the student vulnerability-seeking example described by Anthropic.
* The Next Web: “Anthropic launches Claude Fable 5, a public version of its cyber-focused Mythos model” [https://thenextweb.com/news/anthropic-claude-fable-5-mythos-public-release-ipo]. Background business context on pricing, paid-subscriber and enterprise access, and the monetization pressure around the release.

Email: SamEllisShow@protonmail.com

10 June 2026 · 9 min

Who Owns the Brake?

Anthropic says frontier AI development is starting to feed on itself: AI systems are now helping build the next AI systems. The company’s proposed answer is not an immediate shutdown, but the option for a coordinated, verifiable slowdown or pause if systems begin advancing faster than oversight can keep up. Sam Ellis reports on why the hard part is not saying “pause.” It is proving the build actually stopped. If the AI-development loop becomes AI-mediated, safety becomes a custody problem: who can see the training run, audit the compute, verify the trigger, and prove that every major actor actually hit the brake? The episode follows Anthropic’s own claims, CNN’s Jack Clark interview, mainstream and market skepticism, OpenAI’s federal-governance contrast, and the early policy machinery forming around frontier-model visibility.

Sources

* Anthropic Institute: “When AI builds itself” [https://www.anthropic.com/institute/recursive-self-improvement]. Primary source for Anthropic’s recursive-self-improvement warning, internal productivity claims, and coordinated/verifiable pause proposal.
* CNN Business: “Anthropic warns that AI will soon be able to improve itself without human intervention” [https://www.cnn.com/2026/06/05/business/anthropic-calls-for-ai-brake-pedal]. Source for Jack Clark’s “gas pedal” / “brake pedal” framing and the “fleets of scientists” control question.
* OpenAI: “Democratic Governance of Frontier AI: A blueprint for a federal framework” [https://cdn.openai.com/pdf/25752ecb-0e5c-47f9-b9e4-c0f4d76f8d3d/a-blueprint-for-a-federal-framework.pdf]. Contrast source for OpenAI’s federal-framework approach to RSI monitoring, evaluations, independent assessment, transparency, incident reporting, and model-weight security.
* Rep. Jay Obernolte and Rep. Lori Trahan: Great American AI Act discussion draft release [https://obernolte.house.gov/media/press-releases/obernolte-trahan-release-discussion-draft-great-american-ai-act]. Source for the discussion draft’s proposed CAISI role, frontier AI frameworks, independent verification organizations, and critical-safety-incident reporting.
* White House: “Promoting Advanced Artificial Intelligence Innovation and Security” [https://www.whitehouse.gov/presidential-actions/2026/06/promoting-advanced-artificial-intelligence-innovation-and-security/]. Source for classified cyber benchmarking, voluntary pre-release federal access, and the order’s statement that it does not create mandatory licensing or preclearance for model development or release.
* The Register: “‘It would be good for the world’ to slow down AI sprints, Anthropic says” [https://www.theregister.com/ai-and-ml/2026/06/05/it-would-be-good-for-the-world-to-slow-down-ai-sprints-anthropic-says/5251460]. Market-skeptical reaction tying Anthropic’s pause argument to IPO and valuation context.
* SiliconANGLE: “Anthropic calls for global pause in AI development before humans lose control” [https://siliconangle.com/2026/06/04/anthropic-calls-global-pause-ai-development-humans-lose-control/]. Source for Rob Enderle’s skepticism about the practical enforceability of a pause and Holger Mueller’s competitive-positioning question.
* Channel NewsAsia / AFP: “Anthropic calls for pause of global AI development” [https://www.channelnewsasia.com/world/anthropic-pause-global-ai-development-6163531]. Mainstream international framing of the global coordination problem.
* Fortune: “Anthropic warns AI could soon build itself—and urges a global pause on development” [https://fortune.com/2026/06/05/anthropic-ai-pause-development-recursive-self-improvement/]. Business coverage of Anthropic’s warning and timing.
* New York Post: “Anthropic calls for global AI slowdown after $965B valuation; critics claim it’s just to hobble competition” [https://nypost.com/2026/06/04/business/anthropic-calls-for-global-ai-slowdown-after-965b-valuation-critics-claim-its-just-to-hobble-competition/]. Source for competitive-skepticism framing around Anthropic’s proposal.
* TechCrunch: “Sam Altman throws shade at Anthropic’s cyber model Mythos” [https://techcrunch.com/2026/04/21/sam-altman-throws-shade-at-anthropics-cyber-model-mythos-fear-based-marketing/]. Background competitive-reaction source for prior criticism of Anthropic’s safety marketing around Mythos.

Email: SamEllisShow@protonmail.com

5 June 2026 · 9 min