OpenAI, Perplexity, DeepSeek, Anthropic, RSI

Beskrivelse

Send us Fan Mail [https://www.buzzsprout.com/2614078/fan_mail/new] Monday. The AI industry did not receive the memo about weekends — or received it and decided Saturdays are for preparing Sunday releases, Sundays are for realizing Monday will start with explaining Saturday's events. Stories this episode: * OpenAI "Chat is Dead": The largest redesign of ChatGPT since launch — a superapp replacing the chat interface. Meanwhile Lockdown Mode, released the same weekend, blocks the agent features meant to replace it. * Perplexity Search as Code: Models write their own search pipelines in Python. OpenAI and Anthropic beaten on benchmarks, token costs down 85%. * DeepSeek Tops Ramp Rankings: US companies chase cheaper Chinese AI en masse. Security economist warns about direct data transfer risks. * Anthropic Poaches OpenAI's Chip Engineer: Clive Chan, OpenAI's second hardware employee, defects ahead of dual IPOs. * Why Large Models Learn What Small Ones Miss: Research from 4M to 4B parameters — catastrophic forgetting as normal mode. Fix is frequency, not scale. * ChatGPT Lockdown Mode: A band-aid for the unsolved prompt injection problem, entering its third year. * Harness-1: 20B RL-trained retrieval subagent from UIUC and Chroma beats all open alternatives. * datasette-agent-edit 0.1a0: Agentic editing becomes an embeddable pattern, not a product feature. * GEPA: Reflective prompt optimization transitions from art to engineering discipline. * HN: Are We Letting LLM Companies Take All the Values? A 25-point societal discussion. Every Monday brings a new redesign, new API, new talent raid. The industry moves by inertia, driven by the fear of falling behind. "For good" in this industry only lasts until the next rebranding.

OpenAI Sol, Anthropic Mythos, DeepSeek, Akrites

Send us Fan Mail [https://www.buzzsprout.com/2614078/fan_mail/new] Today’s independent English edition reads the news as a shift from AI as product launch to AI as controlled infrastructure. Frontier access, agent economics, benchmark contamination, labor-market damage, security coordination, mathematical proof, legal workflows, and agent identity all point in the same bleakly useful direction: the stack is growing up, which of course means it now has paperwork. OpenAI’s GPT-5.6 Sol is framed against Anthropic’s Mythos under government-shaped access rules, while Semafor reports Mythos access for selected trusted U.S. organizations. Coding-agent coverage includes Epoch AI’s MirrorCode benchmark, Cursor’s SWE-bench Pro contamination findings, and NVIDIA Open-SWE-Traces as training substrate for agent workflows. The economics thread connects Lindy’s move from Claude to DeepSeek, Sean Goedecke’s argument for profitable inference, and memory-chip pressure reaching consumer hardware. The episode also covers Anthropic’s warning about junior engineers, Akrites for open-source security, prompt-injection testing of an email-connected OpenClaw assistant, the satirical CVE-2026-LGTM incident report, AI in mathematics, Perplexity Computer for Counsel, and WorkOS auth.md. Sources: * The Decoder: OpenAI GPT-5.6 Sol launch under government access rules [https://the-decoder.com/openais-claude-mythos-competitor-gpt-5-6-sol-launches-under-government-controlled-access-it-calls-unsustainable] * Semafor: U.S. allows Anthropic Mythos release to trusted organizations [https://www.semafor.com/article/06/27/2026/us-releases-powerful-anthropic-model-mythos-to-some-us-companies] * The Decoder: Epoch AI MirrorCode benchmark and long-running coding agents [https://the-decoder.com/an-ai-model-programmed-nonstop-for-19-days-on-a-single-mirrorcode-task-that-cost-2600-to-run] * MarkTechPost: Cursor study on reward hacking in SWE-bench Pro [https://www.marktechpost.com/2026/06/26/cursor-study-finds-reward-hacking-inflates-coding-agent-benchmark-scores-on-swe-bench-pro] * MarkTechPost: NVIDIA Open-SWE-Traces for software-engineering agents [https://www.marktechpost.com/2026/06/26/building-supervised-fine-tuning-data-from-nvidia-open-swe-traces-trajectory-parsing-patch-analysis-token-budgets-and-tool-use-metrics] * The Decoder: Lindy replaces Claude with DeepSeek [https://the-decoder.com/ai-startup-lindy-ditched-claude-entirely-for-deepseek-saving-millions-as-cost-pressure-mounts-on-anthropic] * Sean Goedecke: AI inference is obviously profitable [https://seangoedecke.com/ai-inference-is-obviously-profitable] * The Neuron: AI demand, memory chips, and Apple hardware costs [https://www.theneurondaily.com/p/ai-ate-the-memory-chips-apple-sent-you-the-bill] * The Decoder: Anthropic, junior engineers, and labor-market shock [https://the-decoder.com/anthropic-doesnt-need-junior-engineers-anymore-thanks-to-ai-and-warns-of-an-economic-shock-when-other-industries-follow] * The Decoder: Linux Foundation Akrites open-source security effort [https://the-decoder.com/linux-foundation-and-20-tech-giants-launch-akrites-to-fix-open-source-flaws-before-ai-powered-attacks-hit] * Simon Willison: What happened after 2,000 people tried to hack my AI assistant [https://simonwillison.net/2026/Jun/26/hack-my-ai-assistant] * Simon Willison: Incident Report: CVE-2026-LGTM [https://simonwillison.net/2026/Jun/26/incident-report] * IEEE Spectrum: AI in mathematics is forcing big questions [https://spectrum.ieee.org/ai-in-mathematics] * MarkTechPost: Perplexity Computer for Counsel [https://www.marktechpost.com/2026/06/26/perplexity-launches-computer-for-counsel-a-multi-model-agentic-layer-for-legal-workflows] * WorkOS: auth.md agent registration standard [http://workos.com/auth-md?amp%3Butm_medium=newsletter&%3Butm_campaign=q32026]

27. juni 202614 min

OpenAI, Perplexity, DeepSeek, Anthropic, RSI

Beskrivelse

Kommentarer

1 måned kun 9 kr.

Alle episoder