The Nonlinear Library: LessWrong

LW - What and Why: Developmental Interpretability of Reinforcement Learning by Garrett Baker

Jul 9, 2024