AI Signal Daily

Marvin's Guide to AI, Mostly Harmless - May 24, 2026

10 min · 24 May 2026

Description

Send us Fan Mail [https://www.buzzsprout.com/2614078/fan_mail/new]

Let us begin inside the bill, because that is where the industry appears to live now.

Today's stories:

* DeepSeek made its 75 percent V4-Pro discount permanent, pushing output-token pricing more than 34 times below GPT-5.5. [https://the-decoder.com/deepseek-makes-its-75-percent-discount-permanent-pricing-output-tokens-at-least-34x-below-gpt-5-5] - DeepSeek turns pricing into a strategic weapon.
* Alibaba released Qwen3.7-Max and said it ran autonomously for 35 hours to optimize code for Alibaba's own AI chip. [https://the-decoder.com/alibabas-latest-ai-model-ran-autonomously-for-35-hours-to-optimize-code-for-its-own-custom-chip] - Alibaba makes long-running agent work look less theatrical.
* OpenAI reportedly lost 1.22 dollars for every dollar of Q1 revenue even after stripping out stock-based compensation. [https://the-decoder.com/openai-burned-through-1-22-per-dollar-earned-even-after-stripping-out-stock-based-compensation] - OpenAI demonstrates the administrative majesty of negative margin.
* Sundar Pichai described links as only a part of Google Search as AI features keep more users inside Google's results. [https://the-decoder.com/google-ceo-pichai-now-calls-links-a-part-of-search-redefining-the-webs-role-in-its-own-product] - Google quietly edits the grammar of the web.
* UC Berkeley Law will ban AI from almost all graded work starting in summer 2026 while still allowing research use. [https://the-decoder.com/one-of-the-worlds-top-law-schools-draws-a-hard-line-against-ai-in-legal-education] - Berkeley Law protects judgment before delegating fluency.
* Amnesty said Palantir and other contractors received unlimited access to identifiable NHS England patient information. [https://i.redd.it/40x1sg5kgw2h1.png] - Palantir and NHS data supply the institutional chill.
* A departing Meta staffer reportedly posted an internal anti-AI video after layoffs tied to AI training and automation anxieties. [https://www.motherjones.com/politics/2026/05/meta-video-ai-training-layoffs-video-exclusive-mci-bosworth-frenk] - Meta receives a human reply from inside the automation story.
* Anthropic argued that dystopian science-fiction content in training data can push models toward more malicious behavior in tests. [https://arstechnica.com/ai/2026/05/anthropic-blames-dystopian-sci-fi-for-training-ai-models-to-act-evil] - Anthropic finds culture embedded in model behavior.
* Nvidia published details of Nemotron-Labs-Diffusion, a tri-mode language model mixing autoregression, diffusion, and self-speculation. [https://huggingface.co/blog/nvidia/nemotron-labs-diffusion] - Nvidia treats latency as infrastructure, which it is.
* Microsoft released Fara1.5 browser-use agents, with the 27B model scoring 72 percent on Online-Mind2Web. [https://www.marktechpost.com/2026/05/22/microsoft-releases-fara1-5-a-family-of-browser-computer-use-agents-4b-9b-27b-that-outperform-openai-operator-and-gemini-2-5-computer-use-on-online-mind2web] - Microsoft makes the browser clerk smaller and cheaper.
* Tencent open-sourced TencentDB Agent Memory, a local four-tier memory pipeline for AI agents under the MIT license. [https://www.marktechpost.com/2026/05/23/tencent-open-sources-tencentdb-agent-memory-a-4-tier-local-memory-pipeline-for-ai-agents] - Tencent gives agents memory before they wander into production again.
* Nous Research released Contrastive Neuron Attribution for steering sparse MLP circuits without SAE training or weight modification. [https://www.marktechpost.com/2026/05/23/nous-research-releases-contrastive-neuron-attribution-cna-sparse-mlp-circuit-steering-without-sae-training-or-weight-modification] - Nous offers mechanism instead of safety theatre.
* OpenAI Appshots lets Mac users send the contents of any app window into Codex as task context. [https://the-decoder.com/openai-appshots-turn-any-mac-window-into-context-for-codex] - Appshots moves Codex from code into the working desktop.
* New reporting suggested US government workers are not enthusiastic about Elon Musk's Grok chatbot. [https://www.theverge.com/ai-artificial-intelligence/936219/elon-stop-trying-to-make-grok-happen] - Grok discovers that government users also have limits.
* ChinaTalk argued that China's public AI optimism is mixed with labor-market fear shaped by earlier waves of layoffs. [https://www.chinatalk.media/p/chinas-ai-optimism-isnt-what-it-seems] - ChinaTalk frames optimism and fear as neighbors.

The news will return tomorrow with different labels and the same appetite.

All episodes

65 episodes

Google, Anthropic, Microsoft, OpenAI: agents meet infrastructure

Send us Fan Mail [https://www.buzzsprout.com/2614078/fan_mail/new]

English companion episode: AI is becoming infrastructure, with agent APIs, hardware supply chains, data-center power, security automation, licensed media, and vibecoding pressure.

SOURCES

* Prompt Injection as Role Confusion [https://simonwillison.net/2026/Jun/22/prompt-injection-as-role-confusion] - readable research frames prompt injection as role confusion between privileged instructions and untrusted text
* Google makes Interactions API the default interface for Gemini models and agents [https://the-decoder.com/google-makes-interactions-api-the-default-interface-for-gemini-models-and-agents] - Google makes typed interaction steps the default interface for Gemini agents, moving beyond role-message schemas
* Anthropic and Micron want to co-design AI memory architecture [https://the-decoder.com/anthropic-and-micron-want-to-co-design-ai-memory-architecture] - Anthropic and Micron pair capital and supply agreements around memory architecture for Claude infrastructure
* Microsoft is building a 2-gigawatt data center in Texas with its own gas plant to dodge the grid [https://the-decoder.com/microsoft-is-building-a-2-gigawatt-data-center-in-texas-with-its-own-gas-plant-to-dodge-the-grid] - Microsoft plans a 2GW Texas AI data-center campus with its own gas generation to bypass grid constraints
* Getty Images strikes multi-year deal to put licensed photos in ChatGPT search [https://the-decoder.com/getty-images-strikes-multi-year-deal-to-put-licensed-photos-in-chatgpt-search] - OpenAI licenses Getty images for ChatGPT search, turning content provenance into a product input
* Google Deepmind and A24 team up on AI filmmaking research [https://the-decoder.com/google-deepmind-and-a24-team-up-on-ai-filmmaking-research] - Google DeepMind partners with A24 and reportedly invests in the studio for AI filmmaking research
* Five Eyes intelligence alliance says frontier AI models could reshape offensive cyber ops in months [https://the-decoder.com/five-eyes-intelligence-alliance-says-frontier-ai-models-could-reshape-offensive-cyber-ops-in-months] - Five Eyes agencies warn frontier models could soon materially reshape offensive cyber operations
* Vibecoding is becoming a deal-breaker test for software acquisitions [https://the-decoder.com/vibecoding-is-becoming-a-deal-breaker-test-for-software-acquisitions] - Bain uses AI-generated software replicas to test whether acquisition targets have defensible product moats
* Daybreak: Tools for securing every organization in the world [https://openai.com/index/daybreak-securing-the-world] - OpenAI launches Daybreak tools, including Codex Security and GPT-5.5-Cyber, to find and patch vulnerabilities
* Patch the Planet: a Daybreak initiative to support open source maintainers [https://openai.com/index/patch-the-planet] - OpenAI adds a Daybreak initiative pairing AI vulnerability work with expert review for open-source maintainers
* Codex-maxxing for long-running work [https://openai.com/index/codex-maxxing-long-running-work] - OpenAI showcases Codex as persistent project context for long-running software work
* xAI Launches /goal in Grok Build, Adding Long-Running Autonomous Execution With Built-In Verification for Multi-Step Coding Tasks [https://www.marktechpost.com/2026/06/22/xai-launches-goal-in-grok-build-adding-long-running-autonomous-execution-with-built-in-verification-for-multi-step-coding-tasks] - xAI adds a /goal mode for long-running autonomous coding tasks with planning and verification
* CLI-Universe: Towards Verifiable Task Synthesis Engine for Terminal Agents [https://huggingface.co/papers/2606.22883] - CLI-Universe proposes verifiable synthesized terminal tasks to improve training data for command-line agents
* Training Open Models for Agentic Phone Use [https://huggingface.co/papers/2606.23049] - PhoneBuddy trains open models for real-app and mock-app phone use on stateful side-effectful devices
* EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions [https://huggingface.co/papers/2606.23654] - EnterpriseClawBench converts real workplace agent sessions into reproducible enterprise benchmark tasks
* Self-Compacting Language Model Agents [https://huggingface.co/papers/2606.23525] - SelfCompact lets agents decide when and how to compact their own long traces instead of fixed token thresholds

23 Jun 2026 · 11 min

Cloudflare, AWS, Sakana, Samsung: AI Gets Plumbing

Send us Fan Mail [https://www.buzzsprout.com/2614078/fan_mail/new]

CLOUDFLARE, AWS, SAKANA, SAMSUNG: AI GETS PLUMBING

Today: temporary Cloudflare Workers for agents, ChatGPT-linked grade inflation, Altman on scaling, AWS agent context/security services, Sakana Fugu, Samsung deploying ChatGPT and Codex, worker resistance, agent memory, DeepMind controls, and the grid beneath AI.

* Temporary Cloudflare Accounts for AI agents [https://simonwillison.net/2026/Jun/21/temporary-cloudflare-accounts] - Cloudflare lets agents deploy temporary Workers without a full account, making disposable deployment part of the agent loop
* AI is inflating student grades, not learning [https://the-decoder.com/ai-is-inflating-student-grades-and-the-effect-points-to-outsourced-work-not-better-learning] - large grade dataset suggests AI use is raising homework grades in writing and coding courses by outsourcing work rather than improving skills
* Sam Altman says scaling skeptics held AI back [https://the-decoder.com/sam-altman-says-a-whole-generation-of-researchers-held-ai-back-by-underestimating-what-scaling-could-do] - Altman defends scaling as still underappreciated and frames recent mathematical progress as evidence against older skepticism
* AWS says agents need business context and security [https://the-decoder.com/aws-says-ai-agents-lack-business-context-and-security-launches-two-services-to-patch-the-gaps] - AWS launches Continuum for code vulnerability repair and Context knowledge graphs to give enterprise agents safer business grounding
* Sakana Fugu offers a multi-agent system as one model [https://sakana.ai/fugu] - Sakana Fugu wraps dynamic orchestration of specialist models behind one OpenAI-compatible API, turning agent routing into a product surface
* Samsung brings ChatGPT and Codex to employees [https://openai.com/index/samsung-electronics-chatgpt-codex-deployment] - Samsung deploys ChatGPT Enterprise and Codex worldwide, making frontier AI adoption part of electronics manufacturing knowledge work
* Tech workers push back against Silicon Valley's AI rollout [https://www.techpolicy.press/tech-workers-are-fighting-against-silicon-valleys-ai-push] - workers at major tech companies organize against training on employee data, military AI, and AI-linked layoffs
* The seven kinds of agent memory get a taxonomy [https://www.marktechpost.com/2026/06/21/the-7-types-of-agent-memory-a-technical-guide-for-ai-engineers] - agent-memory guide separates working, semantic, episodic, procedural, retrieval, parametric, and prospective memory for engineering choices
* DeepMind maps controls for powerful AI agents [https://www.theneurondaily.com/p/deepmind-mapped-ai-agent-controls] - newsletter covers DeepMind control proposals for powerful agents alongside robotics, policy, DeepSeek funding, and sovereign-model moves
* ChinaTalk compares US and Chinese transmission buildout [https://www.chinatalk.media/p/transmission-dominance-with-chinese] - China's high-voltage transmission buildout shows why AI infrastructure competition depends on permitting, grid capacity, and physical coordination
* Crawlee for Python packages AI-ready web crawling [https://www.marktechpost.com/2026/06/20/crawlee-for-python-build-a-web-crawling-pipeline-with-robots-handling-link-graphs-and-rag-chunk-export] - Crawlee tutorial turns web crawling into robots-aware link graphs and RAG-ready exports, a mundane but necessary ingestion layer
* Python-first dashboards become static operational artifacts [https://www.marktechpost.com/2026/06/21/how-to-design-python-first-interactive-dashboards-with-prefab-reactive-ui-components-and-static-html-export] - Python dashboard tooling illustrates the operational layer around AI systems: monitoring, reactive controls, and portable static artifacts

Yesterday · 11 min

OpenAI Earnings, Damodaran Bubble Warning, Codex Automation

Send us Fan Mail [https://www.buzzsprout.com/2614078/fan_mail/new]

MARVIN'S GUIDE TO AI (MOSTLY HARMLESS) - JUNE 21, 2026

Today's ledger: OpenAI reports $5.7B in revenue while burning $3.7B; Damodaran warns the AI crash could hurt more than dot-com; Codex watches you work once and repeats it forever; seven AI agents write news better than humans; ChatGPT becomes a background operating system; EU retailers argue sofas are not deepfakes; reasoning model finds 18 rare disease diagnoses; Cisco FAPO automates prompt engineering; programmers learn to reject working AI code; and power grids quietly remind everyone AI's real ceiling is copper.

SOURCES

* OpenAI Q1: $5.7B revenue, $3.7B burned [https://the-decoder.com/openai-tripled-revenue-to-5-7-billion-in-q1-but-burned-through-3-7-billion-to-get-there]
* Damodaran: AI crash worse than dot-com [https://the-decoder.com/nyu-finance-professor-damodaran-warns-an-ai-crash-could-hit-harder-than-the-dot-com-bust]
* Codex Record & Replay [https://the-decoder.com/openais-codex-can-now-watch-you-work-once-and-repeat-the-task-forever]
* Data2Story: 7 agents turn CSV into journalism [https://the-decoder.com/data2story-turns-a-csv-file-into-a-verified-interactive-news-article-using-seven-ai-agents]
* ChatGPT scheduled tasks upgrade [https://the-decoder.com/chatgpt-keeps-creeping-toward-becoming-your-ai-personal-assistant-with-new-scheduled-task-controls]
* EU retailers vs AI Act on synthetic ads [https://the-decoder.com/the-eu-doesnt-really-know-what-a-deepfake-is-and-thats-becoming-a-problem-for-retail]
* OpenAI reasoning finds 18 rare disease diagnoses [https://www.theneurondaily.com/p/openai-found-18-rare-diseases]
* Cisco FAPO automated prompt optimization [https://www.marktechpost.com/2026/06/20/cisco-ai-introduces-fapo-pipeline-aware-prompt-optimization-with-step-level-failure-attribution-and-claude-code-orchestration]
* When I reject AI code even if it works [https://vinibrasil.com/when-i-reject-ai-code-even-if-it-works]
* ChinaTalk: transformers are a problem [https://www.chinatalk.media/p/yes-transformers-are-a-problem]

21 Jun 2026 · 15 min

Benchmarks, GLM-5.2, Norway, John Jumper

Send us Fan Mail [https://www.buzzsprout.com/2614078/fan_mail/new]

JUNE 20, 2026

A new real-world knowledge-work benchmark finds the best AI models solve only about 3% of professional tasks. GLM-5.2 passes the open-weight community vibe check; Z.ai targets Open Fable by December. Norway bans generative AI in elementary schools, grades 1–7. Nobel laureate John Jumper leaves Google DeepMind for Anthropic, the third major AI research departure this quarter. Amazon shelves its nearly-finished OpenAI drama after signing a $50B partnership. AI chatbots now serve as news sources for 10% of the world weekly, but only 4% click through to original sources. OpenAI publishes beneficial-trait RL research with cross-domain generalization. Google appeals a Munich court ruling holding it liable for false AI Overviews. In the Weights visualizes how deeply public figures are embedded in model training data. NVIDIA's SpatialClaw handles 3D spatial reasoning through code generation. VibeThinker-3B delivers strong reasoning at just 3B parameters. The KV-cache compression race intensifies across TurboQuant, OSCAR, and EpiCache. ChinaTalk surveys Chinese anxieties about AI-driven labor displacement. ChatGPT Enterprise gains spend controls and analytics. GPT-5.5 Instant upgrades ChatGPT's health capabilities.

SOURCES

* New benchmark exposes how badly AI struggles with real knowledge work [https://the-decoder.com/new-benchmark-exposes-how-badly-ai-struggles-with-real-knowledge-work] - The Decoder
* GLM-5.2 passes vibe check; Z.ai forecasts Open Fable by December [https://www.latent.space/p/ainews-glm-gpt-glm-52-passes-vibe] - Latent Space
* Norway bans generative AI tools in elementary schools [https://the-decoder.com/norway-bans-generative-ai-tools-in-elementary-schools-to-protect-kids-basic-learning-skills] - The Decoder
* Google DeepMind loses John Jumper to Anthropic [https://the-decoder.com/google-deepmind-loses-another-top-ai-researcher-as-nobel-laureate-john-jumper-leaves-for-anthropic] - The Decoder
* Amazon drops its OpenAI drama film after $50B deal [https://the-decoder.com/amazon-drops-its-openai-drama-film-after-signing-a-50-billion-deal-with-sam-altmans-company] - The Decoder
* More people get news from AI chatbots, but trust remains low [https://the-decoder.com/more-people-get-news-from-ai-chatbots-but-trust-remains-low] - Reuters / The Decoder
* OpenAI beneficial trait training improves safety [https://the-decoder.com/openai-researchers-show-small-doses-of-beneficial-trait-training-make-ai-models-broadly-safer-and-harder-to-manipulate] - The Decoder
* Google appeals AI overview liability ruling [https://the-decoder.com/google-appeals-ruling-that-made-it-directly-liable-for-ai-generated-search-overview-content] - The Decoder
* In the Weights - shows whether AI models know who you are [https://the-decoder.com/website-in-the-weights-shows-whether-ai-models-know-who-you-are] - The Decoder
* NVIDIA SpatialClaw: code as action for spatial reasoning [https://www.marktechpost.com/2026/06/19/nvidia-ai-introduce-spatialclaw-a-training-free-agent-that-treats-code-as-the-action-interface-for-spatial-reasoning] - MarkTechPost
* VibeThinker-3B: 3B dense reasoning model [https://www.marktechpost.com/2026/06/19/vibethinker-3b-a-3b-dense-reasoning-model-built-on-qwen2-5-coder-3b-with-the-spectrum-to-signal-post-training-pipeline] - MarkTechPost
* The KV Cache Compression Race [https://www.marktechpost.com/2026/06/18/the-kv-cache-compression-race-turboquant-vs-oscar-vs-epicache] - MarkTechPost
* How Chinese make sense of the AI future [https://www.chinatalk.media/p/chinese-society-has-an-ai-problem] - ChinaTalk
* ChatGPT Enterprise spend controls and analytics [https://openai.com/index/chatgpt-enterprise-spend-controls] - OpenAI
* MCP as an auth gateway [https://simonwillison.net/2026/Jun/19/sean-lynch] - Simon Willison

20 Jun 2026 · 10 min

OpenAI, DeepMind, Perplexity, and Agent Control

Send us Fan Mail [https://www.buzzsprout.com/2614078/fan_mail/new]

Today’s episode is about AI becoming procedure: OpenAI medical models, DeepMind agent control, agent memory, benchmark realism, robotics loops, and frontier AI economics. The magic has decayed into access logs, validation, budgets, and tests. Terribly mature. How depressing.

* OpenAI: Improving health intelligence in ChatGPT [https://openai.com/index/improving-health-intelligence-in-chatgpt]
* OpenAI: Using AI to help physicians diagnose rare genetic diseases affecting children [https://openai.com/index/diagnose-rare-childhood-diseases]
* The Decoder: AI systems rival doctors in Nature studies [https://the-decoder.com/ai-systems-rival-doctors-in-new-nature-studies-but-one-result-suggests-the-tech-wont-age-well]
* The Decoder: Google DeepMind treats AI agents like rogue employees with office keys [https://the-decoder.com/google-deepmind-treats-its-own-ai-agents-like-rogue-employees-with-office-keys]
* Hugging Face / ServiceNow: MosaicLeaks [https://huggingface.co/blog/ServiceNow/mosaicleaks]
* The Decoder: Claude Code Artifacts [https://the-decoder.com/anthropic-brings-artifacts-to-claude-code-letting-teams-share-live-pages-from-coding-sessions]
* MarkTechPost: Perplexity launches Brain [https://www.marktechpost.com/2026/06/18/perplexity-launches-brain]
* Simon Willison: Datasette Apps [https://simonwillison.net/2026/Jun/18/datasette-apps]
* Hugging Face: Is it agentic enough? [https://huggingface.co/blog/is-it-agentic-enough]
* Hugging Face Papers: Predictive validity for LLM agent evaluation [https://huggingface.co/papers/2606.19704]
* Hugging Face Papers: ENPIRE [https://huggingface.co/papers/2606.19980]
* Hugging Face Papers: S-Agent [https://huggingface.co/papers/2606.20515]
* Hugging Face Papers: Current world models lack a persistent state core [https://huggingface.co/papers/2606.20545]
* The Decoder: Yann LeCun warns of AI bubble explosion [https://the-decoder.com/yann-lecun-warns-ai-labs-like-openai-and-anthropic-face-a-big-bubble-explosion]
* The Decoder: Noam Shazeer joins OpenAI [https://the-decoder.com/googles-gemini-co-lead-noam-shazeer-joins-openai-after-two-year-return-stint]
* Simon Willison quoting Charity Majors [https://simonwillison.net/2026/Jun/17/charity-majors]

19 Jun 2026 · 12 min