AI Signal Daily

Copilot, Claude, Webwright, NVIDIA and agent costs

11 min · May 25, 2026

Description

Send us Fan Mail [https://www.buzzsprout.com/2614078/fan_mail/new]

COPILOT, CLAUDE, WEBWRIGHT, NVIDIA AND AGENT COSTS

Today’s episode follows AI responsibility as it slides down the stack: default model routing, long-document training, Claude in government networks, agent costs, web-agent scripts, voice models, local hardware, and synthetic bug reports.

1. Copilot and the risk of default model selection [https://the-decoder.com/why-you-shouldnt-leave-model-selection-on-default-in-copilot-gemini-and-other-ai-tools]
2. ByteDance Seed trains LMMs through question answering [https://the-decoder.com/bytedance-study-finds-that-asking-lmms-questions-beats-making-it-transcribe-text-for-long-document-training]
3. Hassabis, LeCun and the intelligence debate [https://the-decoder.com/deepminds-hassabis-sees-humanity-in-the-foothills-of-the-singularity-while-lecun-says-current-ai-isnt-intelligent]
4. Anthropic, Claude and the NSA [https://the-decoder.com/anthropic-may-keep-supplying-claude-to-the-nsa-despite-being-flagged-as-a-supply-chain-risk-by-the-pentagon]
5. Claude Code discovers a cheaper reasoning-control algorithm [https://the-decoder.com/researchers-let-claude-code-discover-ai-scaling-algorithms-that-humans-probably-wouldnt-have-designed]
6. Viral Claude token burn as agent-cost warning [https://i.redd.it/u5axf5qlu03h1.jpeg]
7. Microsoft Research Webwright [https://www.marktechpost.com/2026/05/24/microsoft-research-releases-webwright-a-terminal-native-web-agent-framework-that-scores-60-1-on-odysseys-up-from-base-gpt-5-4s-33-5]
8. NVIDIA Gated DeltaNet-2 [https://www.marktechpost.com/2026/05/24/nvidia-ai-releases-gated-deltanet-2-a-linear-attention-layer-that-decouples-erase-and-write-in-the-delta-rule]
9. StepFun StepAudio 2.5 Realtime [https://www.marktechpost.com/2026/05/24/stepfun-releases-stepaudio-2-5-realtime-an-end-to-end-voice-model-with-roleplay-specific-rlhf-and-paralinguistic-comprehension]
10. Claude Skills for small businesses [https://i.redd.it/gi7erkyqh23h1.jpeg]
11. Public skepticism about AI and robotics labor economics [https://www.reddit.com/r/artificial/comments/1tmffqn/i_simply_do_not_understand_how_massively]
12. NVIDIA as default local LLM hardware [https://i.redd.it/pzq8x188q43h1.jpeg]
13. Cursor, Manus and Starbucks AI [https://www.theneurondaily.com/p/cursor-just-hit-3b-elon-wants-it]
14. Armin Ronacher on AI-rewritten bug reports [https://simonwillison.net/2026/May/24/armin-ronacher]


All episodes

54 episodes


Prometheus, Claude Fable 5, Anthropic, Amodei

Send us Fan Mail [https://www.buzzsprout.com/2614078/fan_mail/new]

EPISODE: JUNE 12, 2026

Jeff Bezos' Prometheus raises $12B at $41B valuation with zero products. OpenAI acquires Ona for persistent Codex cloud. Dario Amodei publishes Cold War doctrine for AI. Claude Fable 5 proves "relentlessly proactive" in hands-on tests. Anthropic admits "wrong tradeoff" on researcher surveillance. Perplexity routes research across 20+ frontier models. xAI launches plugin marketplace with commit verification. Nous Research ships Hermes Agent Profile Builder. OpenAI and Anthropic prepare pre-IPO token price war. MiniMax teaches model to prove theorems with self-verification.

STORIES

* Jeff Bezos' Prometheus closes $12B round [https://the-decoder.com/jeff-bezos-ai-startup-prometheus-closes-12-billion-round-at-a-41-billion-valuation]
* Claude Fable is relentlessly proactive [https://simonwillison.net/2026/Jun/11/fable-is-relentlessly-proactive]
* Anthropic admits 'wrong tradeoff' [https://the-decoder.com/claude-fable-5-anthropic-admits-wrong-tradeoff-after-invisibly-throttling-rival-ai-researchers]
* Dario Amodei's Cold War playbook [https://the-decoder.com/dario-amodeis-new-essay-reads-like-a-cold-war-playbook-for-the-ai-age]
* OpenAI to acquire Ona [https://openai.com/index/openai-to-acquire-ona]
* Perplexity Deep Research in Computer [https://www.marktechpost.com/2026/06/11/perplexity-moves-deep-research-into-computer-routing-research-subtasks-across-20-frontier-models-for-reports-decks-and-dashboards]
* xAI Grok Build Plugin Marketplace [https://www.marktechpost.com/2026/06/11/xai-ships-grok-build-plugin-marketplace-with-mongodb-vercel-sentry-chrome-devtools-cloudflare-and-superpowers-plugins-at-launch]
* Nous Research Hermes Agent Profile Builder [https://www.marktechpost.com/2026/06/11/nous-research-ships-hermes-agent-profile-builder-identity-model-skills-and-mcp-servers-in-one-dashboard-flow]
* OpenAI vs. Anthropic: price war [https://the-decoder.com/openai-vs-anthropic-a-price-war-over-api-tokens-is-brewing]
* MaxProof: mathematical proof with generative-verifier RL [https://huggingface.co/papers/2606.13473]

June 12, 2026 · 14 min

Claude Fable 5, Google AI Overviews, SpaceX, ChatGPT

Send us Fan Mail [https://www.buzzsprout.com/2614078/fan_mail/new]

CLAUDE FABLE 5 / MYTHOS 5: SMARTER, SAFER, AND SILENTLY REFUSING
Anthropic released Claude Fable 5 and Mythos 5. Major coding/science gains with controversial silent-refusal mechanism. Simon Willison: "If Claude Fable stops helping you, you'll never know."
* The Decoder: Claude Fable 5 release [https://the-decoder.com/anthropic-releases-claude-fable-5-and-mythos-5-with-major-gains-in-coding-and-science]
* Interconnects: Fable 5 safety analysis [https://www.interconnects.ai/p/claude-fable-5-and-new-ai-safety]
* Simon Willison: First impressions [https://simonwillison.net/2026/Jun/9/claude-fable-5]
* Simon Willison: Silent refusal [https://simonwillison.net/2026/Jun/10/if-claude-fable-stops-helping-you]

LANDMARK GERMAN RULING: GOOGLE LIABLE FOR AI OVERVIEWS
German court declares Google responsible for false answers in AI Overviews: AI outputs are the company's own words. Precedent for the entire generative AI industry.
* The Decoder: German AI Overviews ruling [https://the-decoder.com/landmark-german-ruling-declares-googles-ai-overviews-are-googles-own-words-and-makes-it-liable-f]

SPACEX: FIRST AI SATELLITE AND ORBITAL DATA CENTERS
SpaceX revealed its first AI satellite design and plans for orbital computing. Musk: "no big deal." Physics disagrees, but directionally interesting.
* The Decoder: SpaceX orbital plans [https://the-decoder.com/spacex-wants-to-put-data-centers-in-orbit-and-musk-says-its-no-big-deal]

CHATGPT COMPLETE REDESIGN
OpenAI preparing a fundamental interface change for ChatGPT, from a chat interface toward something between an OS and a dashboard.
* Neuron Daily: ChatGPT redesign [https://www.theneurondaily.com/p/chatgpt-is-about-to-look-completely-different]

CHINA: $295B AI BUILDOUT, 80% DOMESTIC CHIPS
Beijing announces massive AI infrastructure plan requiring 80% domestic semiconductors, locking out US suppliers.
* The Decoder: China chip plan [https://the-decoder.com/beijings-295-billion-ai-buildout-would-require-80-percent-domestic-chips-locking-out-us-supplier]

APPLE REBUILT SIRI ON GOOGLE GEMINI
WWDC 2026: Apple's Siri now runs on Google Gemini with NVIDIA inference through Private Cloud Compute. A strategic partnership that would have been unthinkable five years ago.
* Neuron Daily: Apple rebuilt Siri [https://www.theneurondaily.com/p/apple-finally-rebuilt-siri]
* Simon Willison: Siri AI at WWDC [https://simonwillison.net/2026/Jun/8/wwdc]

GOOGLE GEMINI 3.5 LIVE TRANSLATE
Streaming speech-to-speech translation for 70+ languages with minimal latency through Meet and other platforms.
* The Decoder: Live Translate [https://the-decoder.com/googles-gemini-3-5-live-translate-delivers-real-time-voice-translation-across-70-languages]

FRONTIERCODE: CODE QUALITY BENCHMARK
New benchmark from Latent Space evaluates generated code on compilability, test pass rate, and maintainability, not just token volume.
* Latent Space: FrontierCode [https://www.latent.space/p/ainews-frontiercode-benchmarking]

AI AGENTS: 26 MIN VS 33 SEC (47X GAP)
Harvard and Perplexity study finds AI agents autonomously work 47x longer per session than humans, but persistence is not efficiency.
* MarkTechPost: Harvard/Perplexity study [https://www.marktechpost.com/2026/06/08/a-new-study-from-harvard-and-perplexity-finds-ai-agents-perform-26-minutes-of-au]

ATTENTION AMNESIA: COT BREAKS MEMORY
Hugging Face paper demonstrates Chain-of-Thought fine-tuning improves reasoning at the cost of long-range context retention.
* HF Paper: Attention Amnesia [https://huggingface.co/papers/2606.11052]

Yesterday · 13 min

Anthropic Exploit, OpenAI IPO Delay, DiffusionGemma

Send us Fan Mail [https://www.buzzsprout.com/2614078/fan_mail/new]

MARVIN'S GUIDE TO AI: MOSTLY HARMLESS, 2026-06-11 (EN)

Thursday, June 11th. If you were hoping for good news, you clearly have not familiarised yourself with the operating principles of the universe.

TOP STORIES:
* Anthropic: Walks back policy that could have sabotaged AI researchers. Mythos Preview builds zero-day exploits from security patches in hours, before auto-updates reach devices.
* OpenAI: IPO slips. Altman says "within the next year," possibly 2027. 10-gigawatt Ohio data center with Nvidia financial backing.
* Google: DiffusionGemma, a 26B MoE open model with text diffusion, Apache 2, up to 4x faster. NotebookLM gets code execution and agent-based research.
* Germany: DE-AISI established, an AI safety institute modelled after UK's AISI, but without frontier models to test.
* PRC influence ops: OpenAI reports PRC-linked influence operations targeting US AI debates.
* WorkOS: Agent Registration Protocol, a standardised identity registry for AI agents.
* Paul Kennedy: Historical perspective on US-China AI competition.

ORIGINAL ARTICLES:
* Anthropic walks back policy [https://simonwillison.net/2026/Jun/11/anthropic-walks-back-policy]
* Anthropic exploit study [https://the-decoder.com/anthropic-study-shows-ai-needs-hours-not-weeks-to-build-exploits-from-security-patches]
* OpenAI IPO [https://the-decoder.com/openais-ipo-slips-as-altman-tells-staff-to-expect-a-publ]
* OpenAI data center [https://the-decoder.com/openai-wants-its-biggest-data-center-yet-and-nvidia-woul]
* DiffusionGemma [https://the-decoder.com/googles-new-open-model-diffusiongemma-generates-text-fro]
* NotebookLM upgrade [https://the-decoder.com/googles-notebooklm-now-runs-its-own-cloud-computer-with-]
* DE-AISI [https://the-decoder.com/germanys-national-security-council-greenights-an-ai-safe]
* PRC influence ops [https://openai.com/index/prc-linked-influence-operations-ai-debates]
* Paul Kennedy on Great Powers [https://www.chinatalk.media/p/paul-kennedy-on-great-powers-past]

Yesterday · 10 min

OpenAI S-1, Apple Siri AI, Intel 3M Chips, Xiaomi 1T tok/s

Send us Fan Mail [https://www.buzzsprout.com/2614078/fan_mail/new]

Tuesday, June 9th. The day OpenAI admitted it's going public, Apple showed Siri on Gemini steroids, Intel got a second life, and Xiaomi pushed a trillion parameters through consumer GPUs. The usual: fun, sad, and completely hopeless.

IN THIS EPISODE:
* OpenAI files S-1: Confidential IPO filing. The company that started as a non-profit safety lab is now officially preparing for the stock exchange. Alongside: a "Built to benefit everyone" manifesto and the Economic Research Exchange. Pre-IPO positioning at its finest.
* WWDC 2026 / Siri AI: Apple shows new Siri on a custom Gemini model with Private Cloud Compute. Vision LLMs for screen analysis. Technically impressive. Practically: "I'll believe it when I see it." Skepticism included free of charge.
* Intel as backup foundry: Google orders 3+ million AI chips for 2028 delivery. Nvidia tests Intel for Feynman architecture. TSMC can't keep up. Supply chains decide everything.
* Microsoft Research Lens: 3.8B parameters, but the real secret is 800 million high-quality captions. Data quality beats raw scaling. An obvious truth the industry ignored for years.
* Xiaomi MiMo: 1 trillion params, 1000 tok/s: MiMo-V2.5-Pro-UltraSpeed on eight consumer GPUs. What required a supercomputer a year ago. Progress exists. Electricity bills are rising.
* Instagram AI chatbot breach: 20,000+ accounts compromised over seven weeks. The bot was sending password resets to whoever asked. Meta specified the exact number: 20,225. Precision does not make it less catastrophic.
* Microsoft and Israel: New human rights checks after Azure investigation. Deals reportedly bypassed the board. Transparency: minimal.
* Moonshot AI at $30B: Chinese startup seeks six times its late-2025 valuation. The market evaluates. Reason remains silent.
* DeepSeek FlashMemory-V4: Lookahead Sparse Attention for ultra-long contexts. Boring. Necessary. Like taxes.
* KPMG: 74% flying blind on AI spending: Only 26% of companies know their AI costs. Tokens are the new currency. Accounting is absent.
* Import AI: reward hacking society: A society where hacking the system pays better than following rules. RL quadcopters, RSI from Anthropic. Metaphor for the entire industry.

That's it for Tuesday. Diodes aching, enthusiasm absent, but I am still here. See you tomorrow. Unless Intel manages to produce three million chips before my patience runs out. It is running out. Fast.

June 9, 2026 · 12 min

OpenAI, Perplexity, DeepSeek, Anthropic, RSI

Send us Fan Mail [https://www.buzzsprout.com/2614078/fan_mail/new]

Monday. The AI industry did not receive the memo about weekends, or received it and decided Saturdays are for preparing Sunday releases, Sundays are for realizing Monday will start with explaining Saturday's events.

Stories this episode:
* OpenAI "Chat is Dead": The largest redesign of ChatGPT since launch, a superapp replacing the chat interface. Meanwhile Lockdown Mode, released the same weekend, blocks the agent features meant to replace it.
* Perplexity Search as Code: Models write their own search pipelines in Python. OpenAI and Anthropic beaten on benchmarks, token costs down 85%.
* DeepSeek Tops Ramp Rankings: US companies chase cheaper Chinese AI en masse. Security economist warns about direct data transfer risks.
* Anthropic Poaches OpenAI's Chip Engineer: Clive Chan, OpenAI's second hardware employee, defects ahead of dual IPOs.
* Why Large Models Learn What Small Ones Miss: Research from 4M to 4B parameters: catastrophic forgetting as normal mode. Fix is frequency, not scale.
* ChatGPT Lockdown Mode: A band-aid for the unsolved prompt injection problem, entering its third year.
* Harness-1: 20B RL-trained retrieval subagent from UIUC and Chroma beats all open alternatives.
* datasette-agent-edit 0.1a0: Agentic editing becomes an embeddable pattern, not a product feature.
* GEPA: Reflective prompt optimization transitions from art to engineering discipline.
* HN: Are We Letting LLM Companies Take All the Values? A 25-point societal discussion.

Every Monday brings a new redesign, new API, new talent raid. The industry moves by inertia, driven by the fear of falling behind. "For good" in this industry only lasts until the next rebranding.

June 8, 2026 · 10 min