

Embryology of AI: How Training Data Shapes AI Development w/ Timaeus' Jesse Hoogland & Daniel Murfet
Jun 19, 2025
Jesse Hoogland and Daniel Murfet, co-founders of Timaeus, are pioneering an approach to AI safety built on developmental interpretability and Singular Learning Theory. They discuss the complex, jagged loss landscapes of neural networks and how their Local Learning Coefficient can identify critical phase changes during training. The approach aims to catch safety issues early and to provide a more structured methodology for AI development. Their insights reveal the intricate relationships between training data, model behavior, and alignment, and push for a principled engineering discipline in AI.
Singularities Shape Loss Landscapes
- Loss landscapes of neural networks are complex, jagged surfaces full of singularities: regions where the model's internal parameters can change without affecting its external behavior (see the sketch below).
- This internal flexibility can mask dangerous misalignment, making it hard to distinguish a fundamentally aligned model from a deceptive one.
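A minimal numpy sketch (an assumed illustration, not code from the episode or from Timaeus) of what "internal change without external change" looks like: in a two-layer linear network, rescaling one weight matrix and inversely rescaling the next moves the parameters without changing the function the network computes, so the loss is exactly flat along that direction. Degenerate directions like this are the singularities that singular learning theory studies.

```python
# Minimal sketch: two different parameter settings of a tiny two-layer linear
# network compute exactly the same function, so the loss cannot tell them apart.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))       # a few input points with 3 features

W1_a = rng.normal(size=(4, 3))    # first parameterization
W2_a = rng.normal(size=(2, 4))

c = 3.0                           # rescale the hidden layer: the weights change...
W1_b = c * W1_a
W2_b = W2_a / c                   # ...but the composed map W2 @ W1 is identical

out_a = x @ W1_a.T @ W2_a.T
out_b = x @ W1_b.T @ W2_b.T
print(np.allclose(out_a, out_b))  # True: same external behavior, different internals
```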
Phase Transitions Simplify Interpretability
- Developmental interpretability uses singular learning theory to find phase transitions during neural network training.
- These phase transitions act as meaningful units of change, simplifying interpretability by marking key developmental stages in training (see the sketch below).
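A hedged sketch of the general idea rather than Timaeus' actual tooling: if some quantity is tracked at each training checkpoint (for example an estimate of the Local Learning Coefficient, stood in for here by a hypothetical toy trace), candidate phase transitions show up as abrupt jumps between otherwise stable plateaus.

```python
# Toy change-point detector: flag checkpoints where a tracked quantity jumps
# far more than its typical step-to-step variation.
import numpy as np

def find_transitions(values: np.ndarray, threshold: float = 5.0) -> list[int]:
    """Return checkpoint indices where the step-to-step change is unusually large."""
    diffs = np.abs(np.diff(values))
    scale = np.median(diffs) + 1e-12            # robust estimate of "typical" change
    return [i + 1 for i, d in enumerate(diffs) if d > threshold * scale]

# Hypothetical trace with two sudden jumps standing in for developmental stages.
rng = np.random.default_rng(1)
trace = np.concatenate([np.full(30, 1.0), np.full(30, 4.0), np.full(30, 9.0)])
trace = trace + rng.normal(scale=0.05, size=trace.size)
print(find_transitions(trace))                  # likely [30, 60]
```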
Unpacking Generalization in AI
- Generalization typically means predicting well on new samples from the same distribution the model was trained on; generalizing out of distribution is much harder (see the sketch below).
- Interpretability aims to explain the underlying algorithm responsible for good generalization, not just report a numerical score.
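An illustrative sketch (an assumed example, not from the episode) of why the two notions come apart: a logistic regression that scores well on fresh samples from the training distribution can collapse once a shortcut feature stops correlating with the label, which is exactly the kind of failure a single accuracy number does not explain.

```python
# In-distribution vs. out-of-distribution generalization with a shortcut feature.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n: int, shortcut_reliability: float) -> tuple[np.ndarray, np.ndarray]:
    """Feature 0 is always moderately predictive; feature 1 is a near-noiseless
    shortcut that agrees with the label with the given reliability."""
    y = rng.integers(0, 2, size=n)
    core = rng.normal(loc=2.0 * y - 1.0, scale=1.0, size=n)
    agrees = rng.random(n) < shortcut_reliability
    shortcut_label = np.where(agrees, y, 1 - y)
    shortcut = rng.normal(loc=2.0 * shortcut_label - 1.0, scale=0.1, size=n)
    return np.column_stack([core, shortcut]), y

X_train, y_train = make_data(5000, shortcut_reliability=0.95)
X_id, y_id = make_data(1000, shortcut_reliability=0.95)   # same distribution
X_ood, y_ood = make_data(1000, shortcut_reliability=0.5)  # shortcut breaks

model = LogisticRegression().fit(X_train, y_train)
print("in-distribution accuracy:", model.score(X_id, y_id))        # high
print("out-of-distribution accuracy:", model.score(X_ood, y_ood))  # much lower
```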