Ep 52: NVIDIA's 4-bit pretraining technique cuts memory needs for large hybrid models while keeping accuracy nearly identical to FP8.

7 min · May 18, 2026

Description

Models & Agents NVIDIA's 4-bit pretraining technique cuts memory needs for large hybrid models while keeping accuracy nearly identical to FP8. What You Need to Know: NVIDIA released a full 4-bit pretraining stack (NVFP4) that was validated on a 12B Mamba-Transformer trained for 10 trillion tokens. The approach combines selective BF16 layers, Hadamard transforms, and stochastic rounding to stay within 0.04 points of an FP8 baseline on MMLU-Pro. ... AI Disclosure: This podcast is curated by Patrick but uses AI-generated voice synthesis for audio production.


All episodes

60 episodes

Ep 60: Local builders can now treat markdown skill files as optimizable parameters with automated validation gates instead of manual tweaking.

Models & Agents Local builders can now treat markdown skill files as optimizable parameters with automated validation gates instead of manual tweaking. What You Need to Know: A new paper formalizes SkillOpt, using frontier models to propose bounded edits to markdown skills and accepting only those that improve a held-out validation set. Qwen3.5 and Qwen3.6 receive new uncensored and diffusion variants with detailed training notes for consumer hardware. ... AI Disclosure: This podcast is curated by Patrick but uses AI-generated voice synthesis for audio production.

Yesterday · 11 min

Ep 59: Datasette's new slash-key jump menu now launches agent conversations directly from your databases.

Models & Agents Datasette's new slash-key jump menu now launches agent conversations directly from your databases. What You Need to Know: Simon Willison shipped Datasette 1.0a30 with a keyboard-driven "jump to" menu that plugins can extend, plus a datasette-agent plugin that adds a conversation starter form. NuExtract3, a new 4B vision-language model, arrived on Hugging Face for structured extraction and Markdown conversion from documents. ... AI Disclosure: This podcast is curated by Patrick but uses AI-generated voice synthesis for audio production.

May 25, 2026 · 10 min

Ep 58: Looking back at 6 episodes from 2026-05-18 to 2026-05-24 — the stories that mattered, what we learned, and what to watch next.

Models & Agents Weekly Recap: Looking back at 6 episodes from 2026-05-18 to 2026-05-24: the stories that mattered, what we learned, and what to watch next. This Week's Top Stories: From Ep 52 (2026-05-18): What You Need to Know: NVIDIA released a full 4-bit pretraining stack (NVFP4) that was validated on a 12B Mamba-Transformer trained for 10 trillion tokens. ... AI Disclosure: This podcast is curated by Patrick but uses AI-generated voice synthesis for audio production.

May 24, 2026 · 7 min

Ep 57: OpenAI just added goal mode and screen-aware context to Codex, letting agents work autonomously for hours on real tasks.

Models & Agents OpenAI just added goal mode and screen-aware context to Codex, letting agents work autonomously for hours on real tasks. What You Need to Know: OpenAI rolled out Goal mode, Appshots, and advanced annotation in Codex across app, IDE, and CLI. Anthropic reported finding over 10,000 high-severity vulnerabilities through Project Glasswing using Claude models. ... AI Disclosure: This podcast is curated by Patrick but uses AI-generated voice synthesis for audio production.

May 23, 2026 · 11 min

Ep 56: OpenAI just gave Codex the ability to control locked Macs and run multi-day goals, turning it into a true background agent you can launch from your phone.

Models & Agents OpenAI just gave Codex the ability to control locked Macs and run multi-day goals, turning it into a true background agent you can launch from your phone. What You Need to Know: OpenAI shipped several Codex updates today including secure computer use on locked Macs, Goal mode for hours-long autonomous work, and advanced annotation tools. Microsoft released Fara1.5, a family of browser agents that beat OpenAI Operator and Gemini 2.5 on web tasks. ... AI Disclosure: This podcast is curated by Patrick but uses AI-generated voice synthesis for audio production.

May 22, 2026 · 10 min
