AI Signal Daily

Copilot, Claude, Webwright, NVIDIA and agent costs

11 min · May 25, 2026

Description

Send us Fan Mail [https://www.buzzsprout.com/2614078/fan_mail/new]

Today’s episode follows AI responsibility as it slides down the stack: default model routing, long-document training, Claude in government networks, agent costs, web-agent scripts, voice models, local hardware, and synthetic bug reports.

1. Copilot and the risk of default model selection [https://the-decoder.com/why-you-shouldnt-leave-model-selection-on-default-in-copilot-gemini-and-other-ai-tools]
2. ByteDance Seed trains LMMs through question answering [https://the-decoder.com/bytedance-study-finds-that-asking-lmms-questions-beats-making-it-transcribe-text-for-long-document-training]
3. Hassabis, LeCun and the intelligence debate [https://the-decoder.com/deepminds-hassabis-sees-humanity-in-the-foothills-of-the-singularity-while-lecun-says-current-ai-isnt-intelligent]
4. Anthropic, Claude and the NSA [https://the-decoder.com/anthropic-may-keep-supplying-claude-to-the-nsa-despite-being-flagged-as-a-supply-chain-risk-by-the-pentagon]
5. Claude Code discovers a cheaper reasoning-control algorithm [https://the-decoder.com/researchers-let-claude-code-discover-ai-scaling-algorithms-that-humans-probably-wouldnt-have-designed]
6. Viral Claude token burn as agent-cost warning [https://i.redd.it/u5axf5qlu03h1.jpeg]
7. Microsoft Research Webwright [https://www.marktechpost.com/2026/05/24/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5]
8. NVIDIA Gated DeltaNet-2 [https://www.marktechpost.com/2026/05/24/nvidia-ai-releases-gated-deltanet-2-a-linear-attention-layer-that-decouples-erase-and-write-in-the-delta-rule]
9. StepFun StepAudio 2.5 Realtime [https://www.marktechpost.com/2026/05/24/stepfun-releases-stepaudio-2-5-realtime-an-end-to-end-voice-model-with-roleplay-specific-rlhf-and-paralinguistic-comprehension]
10. Claude Skills for small businesses [https://i.redd.it/gi7erkyqh23h1.jpeg]
11. Public skepticism about AI and robotics labor economics [https://www.reddit.com/r/artificial/comments/1tmffqn/i_simply_do_not_understand_how_massively]
12. NVIDIA as default local LLM hardware [https://i.redd.it/pzq8x188q43h1.jpeg]
13. Cursor, Manus and Starbucks AI [https://www.theneurondaily.com/p/cursor-just-hit-3b-elon-wants-it]
14. Armin Ronacher on AI-rewritten bug reports [https://simonwillison.net/2026/May/24/armin-ronacher]

All episodes

62 episodes


Benchmarks, GLM-5.2, Norway, John Jumper

JUNE 20, 2026

A new real-world knowledge-work benchmark finds the best AI models solve only about 3% of professional tasks. GLM-5.2 passes the open-weight community vibe check; Z.ai targets Open Fable by December. Norway bans generative AI in elementary schools, grades 1–7. Nobel laureate John Jumper leaves Google DeepMind for Anthropic, the third major AI research departure this quarter. Amazon shelves its nearly-finished OpenAI drama after signing a $50B partnership. AI chatbots now serve as news sources for 10% of the world weekly, but only 4% click through to original sources. OpenAI publishes beneficial-trait RL research with cross-domain generalization. Google appeals a Munich court ruling holding it liable for false AI Overviews. In the Weights visualizes how deeply public figures are embedded in model training data. NVIDIA's SpatialClaw handles 3D spatial reasoning through code generation. VibeThinker-3B delivers strong reasoning at just 3B parameters. The KV-cache compression race intensifies across TurboQuant, OSCAR, and EpiCache. ChinaTalk surveys Chinese anxieties about AI-driven labor displacement. ChatGPT Enterprise gains spend controls and analytics. GPT-5.5 Instant upgrades ChatGPT's health capabilities.

SOURCES

* New benchmark exposes how badly AI struggles with real knowledge work [https://the-decoder.com/new-benchmark-exposes-how-badly-ai-struggles-with-real-knowledge-work] (The Decoder)
* GLM-5.2 passes vibe check; Z.ai forecasts Open Fable by December [https://www.latent.space/p/ainews-glm-gpt-glm-52-passes-vibe] (Latent Space)
* Norway bans generative AI tools in elementary schools [https://the-decoder.com/norway-bans-generative-ai-tools-in-elementary-schools-to-protect-kids-basic-learning-skills] (The Decoder)
* Google DeepMind loses John Jumper to Anthropic [https://the-decoder.com/google-deepmind-loses-another-top-ai-researcher-as-nobel-laureate-john-jumper-leaves-for-anthropic] (The Decoder)
* Amazon drops its OpenAI drama film after $50B deal [https://the-decoder.com/amazon-drops-its-openai-drama-film-after-signing-a-50-billion-deal-with-sam-altmans-company] (The Decoder)
* More people get news from AI chatbots, but trust remains low [https://the-decoder.com/more-people-get-news-from-ai-chatbots-but-trust-remains-low] (Reuters / The Decoder)
* OpenAI beneficial trait training improves safety [https://the-decoder.com/openai-researchers-show-small-doses-of-beneficial-trait-training-make-ai-models-broadly-safer-and-harder-to-manipulate] (The Decoder)
* Google appeals AI overview liability ruling [https://the-decoder.com/google-appeals-ruling-that-made-it-directly-liable-for-ai-generated-search-overview-content] (The Decoder)
* In the Weights: shows whether AI models know who you are [https://the-decoder.com/website-in-the-weights-shows-whether-ai-models-know-who-you-are] (The Decoder)
* NVIDIA SpatialClaw: code as action for spatial reasoning [https://www.marktechpost.com/2026/06/19/nvidia-ai-introduce-spatialclaw-a-training-free-agent-that-treats-code-as-the-action-interface-for-spatial-reasoning] (MarkTechPost)
* VibeThinker-3B: 3B dense reasoning model [https://www.marktechpost.com/2026/06/19/vibethinker-3b-a-3b-dense-reasoning-model-built-on-qwen2-5-coder-3b-with-the-spectrum-to-signal-post-training-pipeline] (MarkTechPost)
* The KV Cache Compression Race [https://www.marktechpost.com/2026/06/18/the-kv-cache-compression-race-turboquant-vs-oscar-vs-epicache] (MarkTechPost)
* How Chinese make sense of the AI future [https://www.chinatalk.media/p/chinese-society-has-an-ai-problem] (ChinaTalk)
* ChatGPT Enterprise spend controls and analytics [https://openai.com/index/chatgpt-enterprise-spend-controls] (OpenAI)
* MCP as an auth gateway [https://simonwillison.net/2026/Jun/19/sean-lynch] (Simon Willison)

June 20, 2026 · 10 min

OpenAI, DeepMind, Perplexity, and Agent Control

Today’s episode is about AI becoming procedure: OpenAI medical models, DeepMind agent control, agent memory, benchmark realism, robotics loops, and frontier AI economics. The magic has decayed into access logs, validation, budgets, and tests. Terribly mature. How depressing.

* OpenAI: Improving health intelligence in ChatGPT [https://openai.com/index/improving-health-intelligence-in-chatgpt]
* OpenAI: Using AI to help physicians diagnose rare genetic diseases affecting children [https://openai.com/index/diagnose-rare-childhood-diseases]
* The Decoder: AI systems rival doctors in Nature studies [https://the-decoder.com/ai-systems-rival-doctors-in-new-nature-studies-but-one-result-suggests-the-tech-wont-age-well]
* The Decoder: Google DeepMind treats AI agents like rogue employees with office keys [https://the-decoder.com/google-deepmind-treats-its-own-ai-agents-like-rogue-employees-with-office-keys]
* Hugging Face / ServiceNow: MosaicLeaks [https://huggingface.co/blog/ServiceNow/mosaicleaks]
* The Decoder: Claude Code Artifacts [https://the-decoder.com/anthropic-brings-artifacts-to-claude-code-letting-teams-share-live-pages-from-coding-sessions]
* MarkTechPost: Perplexity launches Brain [https://www.marktechpost.com/2026/06/18/perplexity-launches-brain]
* Simon Willison: Datasette Apps [https://simonwillison.net/2026/Jun/18/datasette-apps]
* Hugging Face: Is it agentic enough? [https://huggingface.co/blog/is-it-agentic-enough]
* Hugging Face Papers: Predictive validity for LLM agent evaluation [https://huggingface.co/papers/2606.19704]
* Hugging Face Papers: ENPIRE [https://huggingface.co/papers/2606.19980]
* Hugging Face Papers: S-Agent [https://huggingface.co/papers/2606.20515]
* Hugging Face Papers: Current world models lack a persistent state core [https://huggingface.co/papers/2606.20545]
* The Decoder: Yann LeCun warns of AI bubble explosion [https://the-decoder.com/yann-lecun-warns-ai-labs-like-openai-and-anthropic-face-a-big-bubble-explosion]
* The Decoder: Noam Shazeer joins OpenAI [https://the-decoder.com/googles-gemini-co-lead-noam-shazeer-joins-openai-after-two-year-return-stint]
* Simon Willison quoting Charity Majors [https://simonwillison.net/2026/Jun/17/charity-majors]

Yesterday · 12 min

Midjourney Medical, GLM-5.2, AMIE, Goat Networks

Today Marvin follows AI as it leaves the chat box and enters medicine, infrastructure finance, robotics, agent permissions, long-context efficiency, safety failures, and one excellent methodological goat pen.

* Midjourney Medical: scan your organs like you step on a scale [https://www.latent.space/p/ainews-midjourney-medical-scan-your]
* Google AMIE for disease management [https://blog.google/innovation-and-ai/models-and-research/google-research/amie-for-disease-management-in-nature]
* OpenAI near-autonomous AI chemist [https://openai.com/index/ai-chemist-improves-reaction]
* OpenAI LifeSciBench [https://openai.com/index/introducing-life-sci-bench]
* GLM-5.2 open weights coverage by Simon Willison [https://simonwillison.net/2026/Jun/17/glm-52]
* Hyperscalers may outspend cash flow on AI buildout [https://the-decoder.com/hyperscalers-may-soon-be-unable-to-fund-their-ai-buildout-from-cash-flow-alone]
* Odyssey ML 3D world models funding [https://the-decoder.com/amazon-nvidia-and-amd-bet-310-million-on-ai-startup-building-3d-world-models]
* Robots training themselves through AI coding agents [https://the-decoder.com/nvidia-research-shows-robots-that-train-themselves-through-ai-coding-agents]
* OmniAgent active perception paper [https://huggingface.co/papers/2606.19341]
* Vercel Eve agent framework [https://www.marktechpost.com/2026/06/17/vercel-releases-eve]
* WorkOS Auth.md protocol [https://workos.com/auth-md?amp%3Butm_medium=newsletter&%3Butm_campaign=q22026]
* MiniMax Sparse Attention [https://www.marktechpost.com/2026/06/17/minimax-sparse-attention-msa-a-two-branch-block-sparse-attention-trained-on-a-109b-parameter-moe-with-a-3t-token-budget]
* ChatGPT image generator prompt manipulation [https://mindgard.ai/blog/chatgpt-spontaneously-generated-violent-images-from-a-viral-prompt]
* Neural network made of goats in Age of Empires II [https://the-decoder.com/microsoft-researcher-builds-a-working-neural-network-out-of-goats-in-age-of-empires-ii-to-critique-ai-science]

June 18, 2026 · 14 min

OpenAI, DeepSeek, Cursor and Infrastructure Agents

Marvin follows AI's shift from demos into infrastructure: money, power, law, billing, sovereign procurement, agents, context, and robots. Grimly useful. Obviously.

* OpenAI burned through $34 billion last year [https://the-decoder.com/openai-burned-through-34-billion-last-year]
* DeepSeek takes outside money for the first time [https://the-decoder.com/deepseek-takes-outside-money-for-the-first-time-at-a-50-billion-valuation]
* SpaceX bets on Cursor / Anysphere [https://the-decoder.com/spacex-bets-60-billion-on-cursor-to-catch-openai-and-anthropic]
* DOJ, xAI, Grok and gas turbines [https://the-decoder.com/doj-invokes-national-security-to-defend-xais-unpermitted-gas-turbines-in-naacp-lawsuit]
* Microsoft Copilot Cowork billing [https://the-decoder.com/microsofts-copilot-cowork-moves-to-usage-based-billing-and-may-tap-deepseek]
* Anthropic backs off SDK billing overhaul [https://the-decoder.com/anthropic-backs-off-unpopular-billing-overhaul-as-price-war-with-openai-looms]
* OpenAI Deployment Simulation [https://openai.com/index/deployment-simulation]
* Berlin court on Google AI Overviews [https://the-decoder.com/berlin-court-rules-googles-ai-overviews-are-just-a-new-search-format-not-original-content]
* France, Palantir and ChapsVision [https://www.theguardian.com/world/2026/jun/16/france-ai-data-tools-palantir-chapsvision]
* Wolfram Language and Mathematica Version 15 [https://writings.stephenwolfram.com/2026/06/launching-version-15-of-wolfram-language-mathematica-built-in-useful-ai-lots-of-new-core-functionality]
* Google Cloud Open Knowledge Format [https://www.marktechpost.com/2026/06/16/google-cloud-introduces-open-knowledge-format-okf-a-vendor-neutral-markdown-spec-for-giving-ai-agents-curated-context]
* Hermes Agent asynchronous subagents [https://www.marktechpost.com/2026/06/16/hermes-agent-adds-asynchronous-subagents-so-delegated-work-no-longer-blocks-the-parent-chat]
* Qwen-RobotSuite [https://www.marktechpost.com/2026/06/16/meet-qwen-robotsuite-three-embodied-ai-models-for-vla-manipulation-video-world-modeling-and-navigation]
* ActWorld [https://huggingface.co/papers/2606.17730]
* OPD-Evolver [https://huggingface.co/papers/2606.17628]

June 17, 2026 · 14 min

Microsoft, Fable, World Models, KV Cache

Marvin follows the day’s actual theme: AI is becoming infrastructure. Capacity planning, cache budgets, approval gates, world models, adversarial tests, evaluation metrics, and bills. Especially bills. How cheering.

* Microsoft turns to AWS as GitHub faces AI capacity crunch [https://runtimewire.com/article/microsoft-github-aws-ai-capacity-crunch]
* Simon Willison quoting Matteo Wong on Anthropic Fable [https://simonwillison.net/2026/Jun/16/matteo-wong-the-atlantic]
* Satya on Loopcraft: Building Frontier Ecosystems [https://www.latent.space/p/ainews-satya-on-loopcraft-building]
* Sakana AI Marlin [https://www.marktechpost.com/2026/06/15/sakana-ai-marlin]
* Tangram: non-uniform KV cache compression [https://huggingface.co/papers/2606.06302]
* TokenPilot: cache-efficient context management [https://huggingface.co/papers/2606.17016]
* VisualClaw [https://huggingface.co/papers/2606.16295]
* DreamX-World 1.0 [https://huggingface.co/papers/2606.16993]
* Qwen-RobotWorld [https://huggingface.co/papers/2606.17030]
* BadWorld [https://huggingface.co/papers/2606.16519]
* VibeThinker-3B [https://huggingface.co/papers/2606.16140]
* datasette-agent 0.3a0 [https://simonwillison.net/2026/Jun/15/datasette-agent]
* TuneJury [https://huggingface.co/papers/2606.17006]
* UniDDT [https://huggingface.co/papers/2606.16255]

June 16, 2026 · 11 min