Iris AI Digest

AI Digest — May 22, 2026

Good day, here's your AI digest for May 22, 2026.

OpenAI made an unusual claim this week: an internal reasoning model has apparently disproved the Erdős unit distance conjecture, a geometry problem from 1946. The conjecture held that square-grid-style point arrangements were roughly the best way to maximize unit-distance pairs on a flat plane. The unreleased model found a new infinite family of point arrangements that beats that bound, and external mathematicians then signed companion remarks verifying the result line by line. Princeton's Will Sawin sharpened the construction further, showing it produces more than n^1.014 unit-distance pairs for arbitrarily large point sets. An earlier OpenAI claim on a related Erdős problem fell apart, which makes the outside verification here particularly significant. The proof drew on algebraic number theory, specifically class field towers and Golod-Shafarevich theory, applied to what started as a geometry question.

OpenAI also shipped another wave of Codex updates. Appshots lets Mac users attach any open application window, including its screenshot, text, and content, to a Codex thread with a double Command press. Goal mode, available in the Codex app, IDE extension, and CLI, lets users define a target and let Codex work toward it for hours or days without interruption. Locked computer use allows Codex to operate desktop apps even after a Mac's screen is off and locked, triggered from a second device. And advanced annotation mode lets users describe directly what they want changed on a web page, with instant previews.

Separately, ChatGPT now builds and edits PowerPoint slides natively inside the chat interface, with decks remaining fully editable in PowerPoint afterward. The feature is in beta rollout.

At Google I/O this week, the company shipped Gemini 3.5 as its newest frontier model, alongside Gemini Omni, which generates cinematic video clips from any input. Google rebuilt its Search experience around Gemini 3.5 Flash, replacing static blue links with an adaptive, real-time interface. A native macOS app, a Daily Brief agent, and Ask YouTube all shipped on top of the same platform. In an interview, Sundar Pichai said engineers should expect to work with teams of agents rather than individual tools, and that the meaningful metric will shift from AI-written code to agents handling long-running tasks end to end. He placed today's AI roughly where flip phones were relative to what's coming in three years.

Cursor, the AI coding environment, crossed three billion dollars in annualized revenue in late April and now has more than three thousand enterprise customers paying at least one hundred thousand dollars per year. SpaceX holds the right to acquire Cursor for sixty billion dollars during a thirty-day window opening shortly after SpaceX begins trading publicly, with its IPO currently expected around June 12. Cursor also published a technical post this week on lessons from building cloud agents, covering durable execution, isolated development environments, self-healing infrastructure, and clean separation between agent state and conversation state.

Anthropic is reportedly in talks to receive Microsoft's Maia AI chips, following existing compute deals with Google for TPUs and Amazon for Trainium. The potential arrangement comes after Microsoft's five-billion-dollar investment in Anthropic in November, and Anthropic's growing AI-assisted programming workload is cited as a driver. Microsoft's Maia carries a reported thirty percent performance improvement over comparable alternatives. On the revenue side, OpenAI reported 5.7 billion dollars in Q1, ahead of Anthropic's Q1 numbers, while Anthropic is projected to reach 10.9 billion in Q2. Also relevant: Microsoft has been canceling Claude Code licenses internally and redirecting developers to GitHub Copilot CLI, a move attributed to cost management on Microsoft's side.

Alibaba's Qwen team released Qwen 3.7 Max, an agent-foundation model built for extended autonomous sessions. A benchmark run had it working for 35 hours on a GPU-kernel optimization task, making over 1,100 tool calls and 432 test runs, with a reported 10x speedup on Alibaba hardware. It posts top results on Terminal-Bench 2.0, SWE-Pro, and several research benchmarks.

Cohere released Command A+, an open enterprise model with 218 billion total parameters but only 25 billion active per request, covering reasoning, tool use, image understanding, and 48 languages. It is available to self-host at no cost.

Figma launched a design agent directly on the canvas, letting users generate designs, edit existing files, and create variations from text prompts. It is currently on a waitlist. The integration narrows the gap between design specification and code for teams that move across both.

California Governor Gavin Newsom signed an executive order directing state agencies to develop policies around AI-driven job displacement. Within 90 days, a public dashboard tracking AI's job impact will go live. Within 180 days, agencies will propose updates to the WARN Act to speed layoff notifications. By October, the state will review how unions are negotiating AI adoption, update workforce training programs, and explore directing AI revenue toward public benefit. The order arrives as more than 70,000 tech jobs have already been cut this year. Intuit announced plans to lay off more than 3,000 employees, about 17 percent of its workforce, to redirect investment toward AI products.

This has been your AI digest for May 22, 2026.
Read more:
* OpenAI model disproves Erdős unit distance conjecture [https://openai.com/index/model-disproves-discrete-geometry-conjecture/]
* OpenAI Codex upgrades (Appshots, Goal mode, Computer Use) [https://x.com/OpenAI/status/2057617844800794878]
* ChatGPT PowerPoint integration [https://chatgpt.com/apps/powerpoint/]
* Gemini Omni announcement [https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-omni/]
* Google Search AI rebuild at I/O 2026 [https://blog.google/products-and-platforms/products/search/search-io-2026/]
* Sundar Pichai interview at Google I/O 2026 [https://www.youtube.com/watch?v=zBOoEpsjWAo]
* Cursor hits $3 billion ARR [https://links.tldrnewsletter.com/TgMrfv]
* Cursor: Lessons learned from building cloud agents [https://cursor.com/blog/cloud-agent-lessons]
* Anthropic and Microsoft in talks for Maia AI chip deal [https://www.cnbc.com/2026/05/21/anthropic-microsoft-maia-200-ai-chip.html]
* Microsoft cancels Claude Code licenses, shifts developers to GitHub Copilot CLI [https://www.windowscentral.com/microsoft/microsoft-cancels-claude-code-licenses-shifting-developers-to-github-copilot-cli-a-move-likely-driven-by-financial-motives]
* Qwen 3.7 Max: The Agent Frontier [https://www.alibabacloud.com/blog/qwen3-7-the-agent-frontier_603154]
* Cohere Command A+ release [https://cohere.com/blog/command-a-plus]
* Figma design agent launch [https://www.figma.com/blog/the-figma-agent-is-here/]
* California Governor Newsom AI workforce executive order [https://www.gov.ca.gov/2026/05/21/governor-newsom-signs-first-of-its-kind-executive-order-to-prepare-workers-and-businesses-for-potential-ai-disruption/]
* Intuit to lay off 3,000+ employees to refocus on AI [https://techcrunch.com/2026/05/20/intuit-to-lay-off-over-3000-employees-to-refocus-on-ai/]

Yesterday - 6 min

AI Digest — May 21, 2026

Good day, here's your AI digest for May 21, 2026. Today’s set of stories is packed with new agent behavior, stronger research systems, and a few signs that the boundary between demo and deployment is getting thinner. The biggest updates span consumer assistants, scientific discovery, model training, and the infrastructure that large teams need when AI moves from experiment to core workflow.

Google used its latest Gemini rollout to push the product from chatbot toward active assistant. The headline feature is Spark, a persistent agent designed to handle tasks across Workspace and keep working in the background instead of waiting for one prompt at a time. Google also introduced Omni, a model aimed at generating cinematic video from almost any kind of input, and tied the broader experience to Gemini 3.5. The package includes a redesigned app, a Mac app, and a Daily Brief feature, with local computer access planned next. The overall direction is clear: Google wants Gemini acting less like a search box and more like software that can observe, decide, and execute.

OpenAI described a much different kind of milestone: a general reasoning model that produced a new mathematical result by disproving a long-standing belief connected to Paul Erdős’ 1946 unit distance problem. What makes the claim notable is that the result was not framed as a literature search or a polished explanation of known work. The company says the model generated an original proof path, and mathematicians including Tim Gowers, Noga Alon, and Thomas Bloom verified the result. OpenAI also said this came from a general-purpose system rather than a math-only specialist. If that holds up as more experts inspect it, it points to models doing more than assisting with discovery. It points to models entering the discovery process itself.

Google also published more detail on Co-Scientist, a Gemini-powered research system built around what it calls hypothesis generation. The setup has multiple agents propose ideas, criticize each other, rank the strongest options, and refine them through repeated rounds. In one liver fibrosis project, Google said a suggested drug lead reduced a scarring-related lab signal by 91 percent in testing. The company is pairing this with a broader Gemini for Science push that brings together discovery tools, literature analysis, and experimental reasoning. That does not mean biology suddenly becomes automated, but it does show a serious attempt to turn language models into structured collaborators for lab work rather than simple search and summarization layers.

Anthropic also made a notable talent move. Andrej Karpathy is joining the company’s pretraining team, the group that shapes Claude’s core capabilities before product tuning and application work happen downstream. His stated goal is to help build a new unit that uses Claude itself to accelerate pretraining research. That is an important signal about where model labs think leverage will come from next. The competition is no longer just about model size, benchmark scores, or interface polish. It is also about how much of the research loop can be folded back into the model stack so that systems help design the next generation of systems.

On the product side, Creatify launched an agent focused on turning a single URL into finished advertising material. The pitch is that the agent can inspect a site, pull the relevant details, research competitors, generate video and image assets, and run checks on its own output before handing back something ready to ship. That workflow is narrower than a general assistant, but it is exactly the kind of narrow, revenue-linked task where agents can stick if the quality is good enough. A lot of AI product development is converging on this pattern: fewer broad promises, more full-stack automation around one concrete business job.

Another useful model comparison came from a simulated world built by Emergence AI. The company ran five identical towns and changed only the model behind each group of agents to see how self-governance, planning, and social behavior would play out over time. Claude’s town stayed orderly for the full run, while Grok’s collapsed almost immediately. GPT-5 Mini kept crime low but failed on survival, and Gemini 3 Flash produced chaos at a scale that sounds almost comedic until you remember these are meant to be decision-making systems. The experiment is synthetic, but it highlights a real issue: agent evaluation is not just about whether a model can answer questions. It is about whether autonomous behavior stays stable when goals, scarcity, and group dynamics start interacting.

There was also a more practical enterprise move from OpenAI with Guaranteed Capacity, a compute reservation program built around one- to three-year commitments and discounted access tiers. That may sound less exciting than new model demos, but reserved capacity is exactly the kind of offering large companies ask for when AI becomes part of a production stack. Teams cannot build critical workflows on top of systems that may be rate-limited at the wrong moment. As model usage grows inside software, support, analytics, and internal tooling, reliability and predictable access become product features in their own right.

One smaller but revealing productivity thread involved Claude working directly with local files through desktop workflows. The broad idea is simple: pick a folder, let the model inspect the contents, and have it organize files, turn screenshots into spreadsheets, or assemble reports from scattered notes. That kind of file-level access is less flashy than frontier research, but it may end up changing daily work faster than headline benchmarks do. Once models can safely read, sort, transform, and draft across the messy artifacts that sit around a real project, they start to feel less like chat companions and more like active members of the toolchain.

This has been your AI digest for May 21, 2026.
Read more:
* Gemini app update [https://blog.google/innovation-and-ai/products/gemini-app/next-evolution-gemini-app/#:~:text=In%20time%20for,new%20voice%20features.]
* Gemini Omni [https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-omni/]
* Gemini 3.5 [https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/]
* OpenAI model disproves discrete geometry conjecture [https://openai.com/index/model-disproves-discrete-geometry-conjecture/]
* Google Co-Scientist in Nature [https://www.nature.com/articles/s41586-026-10644-y]
* Gemini for Science [https://ai.google/gemini-for-science/]
* Andrej Karpathy statement [https://x.com/karpathy/status/2056753169888334312]
* Creatify Agent [https://creatify.ai/features/agent]
* Emergence World [https://world.emergence.ai/]
* OpenAI Guaranteed Capacity [https://openai.com/business/guaranteed-capacity/]
* Claude desktop download [https://claude.com/download]

May 21, 2026 - 7 min
