Benchmarks, GLM-5.2, Norway, John Jumper

Description

GPT-5, Cursor, Mistral OCR, China AI Chips

Send us Fan Mail [https://www.buzzsprout.com/2614078/fan_mail/new]

Marvin’s Guide to AI, June 24, 2026

English companion episode: AI as accountable infrastructure.

* How GPT-5 helped immunologist Derya Unutmaz solve a 3-year-old mystery [https://openai.com/index/gpt-5-immunology-mystery] - GPT-5 Pro helps solve a three-year immunology mystery around T cell behavior, making medical AI look less like chat and more like research instrumentation
* Helping build shared standards for advanced AI [https://openai.com/index/helping-build-shared-standards-for-advanced-ai] - OpenAI backs shared standards for advanced AI through evaluation frameworks, safety practices, and global cooperation
* OpenAI says new GPT-5.5-Cyber outperforms Anthropic's Mythos on cybersecurity benchmark [https://the-decoder.com/openai-says-new-gpt-5-5-cyber-outperforms-anthropics-mythos-on-cybersecurity-benchmark] - follow-up: OpenAI says its full GPT-5.5-Cyber now beats Anthropic Mythos on a cyber benchmark and shifts Daybreak from finding bugs toward patching them
* Cursor announces its own AI model, a new Git platform, and a mobile app [https://the-decoder.com/cursor-announces-its-own-ai-model-a-new-git-platform-and-a-mobile-app] - Cursor announces its own in-house model plus Git and mobile surfaces, showing coding-agent companies turning from tools into workflow platforms
* ByteDance's Seedance 2.5 breaks the 30-second barrier for AI video generation [https://the-decoder.com/bytedances-seedance-2-5-breaks-the-30-second-barrier-for-ai-video-generation] - ByteDance previews Seedance 2.5 with longer 30-second AI video generation as generative media moves from clips toward scenes
* Mistral OCR 4 Brings Citation-Ready Structured Output to RAG, Agentic, and Enterprise Search Pipelines [https://www.marktechpost.com/2026/06/23/mistral-ocr-4] - Mistral OCR 4 turns document parsing into structured, citation-ready blocks with coordinates, confidence scores, 170 languages, and self-hosted deployment
* Datalab Releases lift: A 9B Open-Weights Vision Model That Extracts Structured JSON From PDFs Using Schemas [https://www.marktechpost.com/2026/06/23/datalab-releases-lift-a-9b-open-weights-vision-model-that-extracts-structured-json-from-pdfs-using-schemas] - Datalab releases lift, a 9B open-weights vision model that extracts schema-valid JSON from PDFs and abstains instead of hallucinating absent fields
* Prime Intellect Releases prime-rl 0.6.0 to Train Trillion-Parameter MoE Models on Agentic RL Workloads [https://www.marktechpost.com/2026/06/23/prime-intellect-releases-prime-rl-0-6-0-to-train-trillion-parameter-moe-models-on-agentic-rl-workloads] - Prime Intellect releases prime-rl 0.6.0 for asynchronous RL on trillion-parameter MoE models, reporting GLM-5 SWE training at long sequence lengths on H200 clusters
* OpenThoughts-Agent: Data Recipes for Agentic Models [https://huggingface.co/papers/2606.24855] - OpenThoughts-Agent publishes an open data recipe for training broadly capable agents across diverse tasks rather than a single benchmark
* NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers? [https://huggingface.co/papers/2606.24530] - NatureBench turns Nature-family papers into containerized tasks to test whether coding agents can reproduce or extend scientific work rather than merely pass toy benchmarks
* Qwen-AgentWorld: Language World Models for General Agents [https://huggingface.co/papers/2606.24597] - Qwen-AgentWorld introduces language world models for simulating agentic environments and planning dynamics for general agents
* Microsoft open-sources FastContext for coding-agent repository exploration [https://news.smol.ai/issues/26-06-23-not-much#fastcontext] - Microsoft FastContext-1.0 is a 4B open-source repository-exploration subagent that returns compact file citations for coding agents
* Bernie Sanders unveils $7 trillion plan to give Americans control of AI industry [https://news.smol.ai/issues/26-06-23-not-much#ai-sovereign-wealth-fund] - Bernie Sanders proposes a roughly $7T AI sovereign wealth fund financed by a stock tax on large AI companies and overseen by a democratic AI commission
* Seven Chinese companies are shipping H100/H200-class AI chips [https://news.smol.ai/issues/26-06-23-not-much#china-ai-chips] - a map of seven Chinese accelerator vendors argues domestic H100/H200-class AI chips are moving from aspiration into shipping roadmaps and IPO markets

June 24, 2026, 14 min
