Alex Spies - Structured Representations in Maze-Solving Transformers

59 min · 1 June 2024

Description

In this discussion, Alex Spies will provide an overview of mechanistic interpretability tools and the approaches researchers employ to "reverse engineer" transformer models. He will then explain how his team used some of these techniques to uncover emergent structures in the models they trained and how these structures may facilitate a systematic understanding of internal search processes. The discussion also asks what guarantees, if any, mechanistic interpretability can provide for logic-based programs. Alex is a PhD student at Imperial College London who is currently in Tokyo as a Research Fellow at the National Institute of Informatics (NII).

All episodes

3 episodes

Manuel Baltieri - Agentic systems and animism in Japan

In this episode, we invited Manuel Baltieri, Chief Researcher in Foundations of Agency, Life and Cognition at Araya. Manuel’s research interests include formal definitions of agency, life and cognition. Specifically, his work focuses on questions regarding the differences between agentic vs. non-agentic systems, living vs. non-living organisms, and cognitive vs. non-cognitive architectures. During this discussion, Manuel will talk about his journey in Japan's AI Safety scene and will provide an overview of agentic systems, animism in Japan, and opportunities to keep an eye out for if you're considering working or researching in Japan.

5 September 2024 · 1 h 7 min

Welcome to MM

Artificial intelligence is here, and it’s changing our world. Let’s be Mindful about it.

1 June 2024 · 1 min

Alex Spies - Structured Representations in Maze-Solving Transformers

1 June 2024 · 59 min
