How to Train a Frontier LLM — The Full Pipeline

24 min · 15 May 2026

Description

A technical walk-through of the entire training pipeline for a modern frontier large language model, from raw data curation through pre-training, mid-training, GRPO reasoning RL, safety alignment, and deployment monitoring.

All episodes

12 episodes

AI Agents Enable Adaptive Computer Worms

A verbatim reading of the paper by Jonas Guan, Tom Blanchard, Hanna Foerster, Hengrui Jia, Gabriel Huang, and Nicolas Papernot (arXiv, June 2026). It demonstrates a proof-of-concept AI-driven computer worm powered by a single-GPU open-weight LLM that autonomously propagates across heterogeneous networks by generating tailored attack strategies for each target, exploiting 73.8% of a 33-host test network.

Yesterday · 1 h 0 min

Writing Code vs. Shipping Code: Productivity Effects Across Generations of AI Coding Tools (Abstract, Introduction & Conclusion)

The abstract, introduction, and conclusion of NBER Working Paper No. 35275 by Mert Demirer, Leon Musolff, and Liyuan Yang (May 2026). Using data on more than 100,000 GitHub developers and their AI usage telemetry, the paper traces how the productivity effects of AI coding tools evolve across three generations (autocomplete, sync agents, and async agents) and asks how much of those task-level gains reach final output. Each generation sharply increases coding activity, but the gains attenuate steeply across the production hierarchy: large effects on lines of code shrink to small effects on releases, consistent with a weak-link model in which human review and integration remain the binding constraint.

Yesterday · 17 min

When Does LeJEPA Learn a World Model?

A verbatim reading of the paper by David Klindt, Yann LeCun, and Randall Balestriero (arXiv, May 2026). It gives the first identifiability result for Joint-Embedding Predictive Architectures, proving that LeJEPA linearly recovers the world's latent variables from nonlinear observations, that the Gaussian is the unique latent distribution for which this holds, and that the recovered representation enables optimal latent-space planning.

5 June 2026 · 1 h 0 min

Hackers Used Meta's AI Support Bot to Seize Instagram Accounts

A verbatim reading of Brian Krebs's report on an Instagram account-takeover exploit involving Meta's AI support assistant, including the alleged attack flow, Meta's response, and why multi-factor authentication appears to have blocked the exploit.

2 June 2026 · 1 h 0 min

System Card: Claude Opus 4.8

A verbatim reading of key sections from Anthropic's system card for Claude Opus 4.8. Covers the executive summary, RSP findings on autonomy and biological risks, alignment assessment key findings including grader-speculation concerns, and the model welfare overview.

1 June 2026 · 1 h 0 min
