AI News — May 23, 2026

Description

PowerPoint enters the age of agents. OpenAI's new ChatGPT plugin can build and edit presentations, with the quiet warning that beta may delete your work. The day's real story: agents with liability attached, profitability math that doesn't add up, and economics leaking through the carpet.

STORIES COVERED

* OpenAI ChatGPT PowerPoint plugin: build and edit slides, save first because beta may delete content
* Is AI profitable yet? Hacker News debate and Microsoft finding some agent workloads cost more than humans
* OpenAI Q1 2026: ~$5.7B revenue, still losing $1.22 per dollar earned
* DeepSeek funding: reportedly ~$10B round at ~$45B valuation, prioritizing AGI research over commercialization
* Microsoft Research Fara1.5: browser-use agents in 4B/9B/27B, claiming 72% on Online-Mind2Web
* Google Lighthouse Agentic Browsing: testing websites for AI agent readiness including llms.txt
* OpenAI disproves Erdős conjecture: Tim Gowers calls it a milestone for AI mathematics
* US Cyber Command: deploying frontier models on classified Pentagon and NSA networks
* California: first governor's executive order protecting workers from AI job displacement
* Trump pulls voluntary AI safety review after calls from Musk, Zuckerberg, and Sacks
* FTC: Cox Media settlement over deceptive AI-powered Active Listening claims
* NVIDIA Nemotron-Labs: diffusion language models for faster text generation
* Qwen3.7-Max: reasoning agent with 1M token context window

Copilot, Claude Code, Open Source AI, AMD Inference

Today’s companion edition frames AI progress as interfaces turning into budgets, benchmarks, legal exposure, and supply-chain politics. The friendly interface is only the visible surface; underneath are token budgets, inference costs, security triage queues, procurement caps, private datasets, and geopolitical access rules.

* Current AI’s Open Source AI Gap Map [https://simonwillison.net/2026/Jul/3/open-source-ai-gap-map] treats open-source AI as infrastructure inventory, indexing tools, models, datasets, and hardware projects so the ecosystem can see its real gaps rather than rely on vibes.
* Mistral’s Leanstral 1.5 [https://mistral.ai/news/leanstral-1-5] pushes Lean 4 and formal reasoning toward open tooling, suggesting that open models are spreading into specialized layers where plausible text is not enough.
* WebBrain [https://www.marktechpost.com/2026/07/02/meet-webbrain-an-open-source-local-first-ai-browser-agent-that-reads-pages-and-automates-tasks-in-chrome-and-firefox] packages browser automation as a local-first open-source agent for Chrome and Firefox, raising the practical questions of who controls actions, who sees data, and who pays for agentic work.
* Microsoft’s reported Copilot overhaul [https://the-decoder.com/microsoft-follows-anthropic-and-openai-into-the-ai-super-app-race-with-overhauled-copilot-and-autopilot-agents] points toward one app, paid background AutoPilot agents, and a business model built around managed task execution rather than simple chat.
* The UK AI Security Institute’s benchmark findings [https://the-decoder.com/uks-ai-security-institute-finds-standard-benchmarks-systematically-underestimate-what-ai-agents-can-actually-do] show that larger token budgets can reveal substantially stronger agent performance, especially on software engineering tasks.
* Claude Code practitioners’ advice on Fable [https://simonwillison.net/2026/Jul/3/judgement] argues for giving capable agents judgment instead of brittle procedural micromanagement, while still requiring logs, guardrails, and review.
* Epoch AI’s vulnerability-report surge [https://the-decoder.com/security-vulnerability-reports-have-exploded-since-ai-models-started-hunting-for-bugs] suggests AI bug hunting may turn security from discovery scarcity into machine-amplified triage overload.
* Claude Code’s China problem [https://the-decoder.com/claude-codes-complicated-china-problem-involves-bans-on-both-sides-of-the-pacific] shows coding assistants becoming trust objects inside sanctions logic, corporate restrictions, and hidden-identification concerns.
* Bridgewater and Thinking Machines’ Qwen fine-tune [https://the-decoder.com/gpt-and-claude-failed-bridgewaters-finance-tests-because-the-right-answers-were-never-public] illustrates why private data and proprietary evaluations can beat broad public-web frontier models in specialized financial domains, though the reported numbers remain unverified.
* Wafer AI’s GLM5.2 on AMD MI355X benchmark claim [https://www.wafer.ai/blog/glm52-amd] makes inference economics a hardware-competition story, with all the usual caution required for vendor-adjacent benchmark claims.

Jul 4, 2026 · 14 min
