From Visualizations to Circuits: The Origins of the Glass Box (The Glass Box Pt. 2)

23 min · Feb 9, 2026

Description

Is AI just a "giant math soup" that happens to work, or is it a machine with parts we can understand? In Part 2 of The Glass Box series, we dig into the origin story of Mechanistic Interpretability. We trace the 7-year arc from the first "camera" that let us see inside a neural network (Zeiler & Fergus, 2013) to the "microscope" of Feature Visualization (Olah et al., 2017) and the eventual "blueprint" of Circuits (2020). We break down the four foundational papers that gave us the instruction manual for auditing AI and the plot twist (Superposition) that reminded us that the brain of these neural networks is a compression machine.

Papers covered:
* Visualizing and Understanding Convolutional Networks (2013)
* Feature Visualization (2017)
* The Building Blocks of Interpretability (2018)
* Zoom In: An Introduction to Circuits (2020)

Grab a drink. It’s time to see how the tools were built.

All episodes

2 episodes

From Visualizations to Circuits: The Origins of the Glass Box (The Glass Box Pt. 2)

Feb 9, 2026 · 23 min

Reading the Mind of the Machine (The Glass Box Pt. 1)

We have spent the last decade studying AI like psychologists, observing behavior and guessing at thoughts. It’s time to become neurosurgeons. In this debut episode, AI Engineer Josh Rauvola introduces Mechanistic Interpretability: the emerging science of cracking open neural networks to map the specific circuits inside. We break down how labs like Anthropic, OpenAI, and DeepMind are moving from "building" models to "auditing" them, and why this shift is the only way to truly trust the systems we are building.

Jan 20, 2026 · 15 min
