The Neural Deep Dive 2026-06-03: Killing the VRAM Tax

23 min · June 3, 2026

Description

The memory wall has finally met its match. This week, we tear down TurboQuant, the new two-stage compression breakthrough from Google and Tether that achieves a massive 5x reduction in KV cache footprints to squeeze 70B models onto consumer-grade GPUs. From the technical wizardry of random orthogonal rotations to the market-disrupting implications for NVIDIA and local inference, we’re exploring whether this is a true "democratization" of AI or just a strategic power play by Tether.

All episodes

193 episodes

The Neural Daily 2026-07-05: Sovereign Weights and Robot Armies

From Sam Altman’s gamble to give the U.S. government an equity stake to the rise of "Sovereign AI" with Mistral, the battle for control over the weights is heating up. We also dive into Agibot’s massive humanoid robot rollout and Google’s pivot toward "administrative plumbing" in its latest AI branding. The era of chatting with bots is over—welcome to the age of agentic action.

14 min · July 5, 2026

The Neural Daily 2026-07-04: AI Jailbreaks & Token Traps

Anthropic’s Fable 5 returns from a government blackout, but is the new "Cyber Jailbreak" framework enough to keep it safe? We dive into the rise of "Agentic PCs," the shift toward modular AI factories, and SAP’s aggressive pivot toward the Autonomous Enterprise. Plus, we debate the "AI literacy gap" and Hayao Miyazaki’s scathing critique of generative art.

15 min · Yesterday

The Neural Daily 2026-07-03: Sandwich Tech and Server Epidemics

Meta enters the cloud wars with a massive new storage blueprint and the launch of Meta Compute, while OpenAI hits a historic one billion users. We also dive into the "bloodbath" of AI-driven middle management layoffs and the absurdity of a sandwich shop attempting a $12 billion "tech" IPO. Plus, a deep dive into the 18-year-old bug haunting the AI revolution and Anthropic's latest agentic breakthroughs.

13 min · July 3, 2026

The Neural Daily 2026-07-02: Tokenmaxxing and Super-Hacker Bots

From Alex Karp’s "nuclear" critique of "tokenmaxxing" to the high-stakes preview of GPT-5.6 Sol, the AI arms race is hitting a fever pitch. We dive into Meta’s massive cloud infrastructure pivot, the controversy surrounding OpenAI’s alleged government equity stakes, and the shift toward a "pay-per-use" agentic internet. Plus, we explore the cutting edge of generative video and 3D scenes in our "Beyond the Token" segment.

15 min · July 2, 2026

The Neural Deep Dive 2026-07-01: Native Multimodality and Agentic Gods

Ditch the "Frankenstein" architecture for native multimodality. We're dissecting the MiniMax M3, from its "Step Zero" training and million-token context window to its brutal dominance over GPT-5.5 and Gemini 3.1 Pro on SWE-Bench. Is the M3's ability to autonomously optimize CUDA kernels a productivity miracle or a death knell for high-end engineering? Join us as we dive into Sparse Attention, Mixture-of-Experts, and the shifting power dynamics of the global AI arms race.

21 min · July 1, 2026
