"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

Embryology of AI: How Training Data Shapes AI Development w/ Timaeus' Jesse Hoogland & Daniel Murfet

Jun 19, 2025
Jesse Hoogland and Daniel Murfet, co-founders of Timaeus, are pioneering an approach to AI safety centered on developmental interpretability, grounded in Singular Learning Theory. They discuss the complex, jagged loss landscapes of neural networks and how their Local Learning Coefficient can identify critical phase changes during training. The approach aims to catch safety issues early and to give AI development a more structured methodology. Their insights connect training data, model behavior, and alignment, and push for a principled engineering discipline in AI.
INSIGHT

Singularities Shape Loss Landscapes

  • Loss landscapes of neural networks are complex, jagged surfaces full of singularities, so a model's internals can change substantially without any change in its external behavior (see the toy example below).
  • Such hidden internal change can mask dangerous misalignment, making it hard to distinguish a fundamentally aligned model from a deceptive one.
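The simplest way to see how internals can move while behavior stays fixed is a rescaling symmetry in a tiny two-layer linear model. The sketch below is purely illustrative (the toy function and numbers are assumptions, not from the episode): the parameters change by orders of magnitude along a degenerate direction of the loss landscape, yet every output is identical.

```python
import torch

# A two-layer linear "network" f(x) = a * (b * x). Rescaling (a, b) -> (c*a, b/c)
# moves the parameters arbitrarily far while the input-output map, and hence the
# loss, is exactly unchanged. These flat, degenerate directions are the simplest
# example of the singular structure that Singular Learning Theory studies.
a, b = torch.tensor(2.0), torch.tensor(3.0)
x = torch.linspace(-1.0, 1.0, 5)

for c in [1.0, 4.0, 0.25]:  # powers of two keep the arithmetic exact in float32
    a_new, b_new = c * a, b / c
    outputs = a_new * (b_new * x)
    print(f"c={c:>5}: params=({float(a_new):.2f}, {float(b_new):.2f}), "
          f"outputs={outputs.tolist()}")
```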
INSIGHT

Phase Transitions Simplify Interpretability

  • Developmental interpretability uses singular learning theory to find phase transitions during neural network training.
  • These phase transitions act as meaningful units of change, simplifying interpretability by marking key developmental stages in training; the sketch below illustrates one way such transitions can be looked for.
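Phase transitions of this kind are typically looked for by tracking the Local Learning Coefficient (LLC) across training checkpoints and watching for sudden jumps. Below is a minimal, hedged sketch of one common way to estimate such a quantity, a WBIC-style SGLD estimator tethered to a checkpoint; the toy model, data, and hyperparameters are illustrative assumptions, not Timaeus' implementation.

```python
import copy
import math
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy regression task and model (illustrative stand-ins).
n = 1024
X = torch.randn(n, 2)
y = 0.7 * X[:, :1] - 0.3 * X[:, 1:] + 0.05 * torch.randn(n, 1)
model = nn.Sequential(nn.Linear(2, 8), nn.Tanh(), nn.Linear(8, 1))

def full_loss(m):
    return nn.functional.mse_loss(m(X), y)

def estimate_llc(trained, steps=500, eps=1e-4, gamma=100.0):
    """Rough SGLD-based estimate of a Local-Learning-Coefficient-like quantity.

    Uses the WBIC-style formula lambda_hat = n * beta * (E[L] - L(w*)) with
    beta = 1 / log(n), where E[L] is averaged over an SGLD chain that a
    quadratic "localization" term keeps near the checkpoint w*.
    """
    beta = 1.0 / math.log(n)
    w_star = [p.detach().clone() for p in trained.parameters()]
    loss_star = full_loss(trained).item()
    sampler = copy.deepcopy(trained)

    losses = []
    for _ in range(steps):
        loss = full_loss(sampler)
        sampler.zero_grad()
        loss.backward()
        with torch.no_grad():
            for p, p0 in zip(sampler.parameters(), w_star):
                # Drift = gradient of (beta * n * L(w) + (gamma / 2) * ||w - w*||^2).
                drift = beta * n * p.grad + gamma * (p - p0)
                p.add_(-0.5 * eps * drift + math.sqrt(eps) * torch.randn_like(p))
        losses.append(loss.item())

    kept = losses[steps // 2:]  # discard burn-in
    return n * beta * (sum(kept) / len(kept) - loss_star)

# Train for a while, logging the estimate at checkpoints; a sudden jump is the
# kind of signal developmental interpretability reads as a phase transition.
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(2001):
    opt.zero_grad()
    full_loss(model).backward()
    opt.step()
    if step % 500 == 0:
        print(step, round(estimate_llc(model), 3))
```

The localization term gamma is what keeps the chain, and hence the estimate, local to the checkpoint; without it, the sampler would wander toward unrelated regions of the loss landscape.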
INSIGHT

Unpacking Generalization in AI

  • Generalization typically means predicting well on new samples from the same distribution as the training data; out-of-distribution generalization is much harder (the sketch below makes the gap concrete).
  • Interpretability aims to explain the underlying algorithm responsible for good generalization, not just to report a numerical measure of it.
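To make the in-distribution versus out-of-distribution contrast concrete, here is a small, purely illustrative PyTorch sketch (the sine task, input ranges, and architecture are assumptions, not from the episode): the network scores well on fresh samples from its training range and much worse on a shifted range it never saw.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_data(lo, hi, n=512):
    """Sample inputs uniformly from [lo, hi] with targets y = sin(x)."""
    x = torch.empty(n, 1).uniform_(lo, hi)
    return x, torch.sin(x)

# Fit a small MLP on x in [-3, 3].
model = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x_train, y_train = make_data(-3.0, 3.0)

for _ in range(3000):
    opt.zero_grad()
    nn.functional.mse_loss(model(x_train), y_train).backward()
    opt.step()

# Fresh samples from the training range (in-distribution) versus a shifted
# range the model never saw (out-of-distribution).
with torch.no_grad():
    for name, (lo, hi) in [("in-distribution    ", (-3.0, 3.0)),
                           ("out-of-distribution", (6.0, 9.0))]:
        x_test, y_test = make_data(lo, hi)
        print(name, nn.functional.mse_loss(model(x_test), y_test).item())
```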