LessWrong (Curated & Popular)

Mechanistically Eliciting Latent Behaviors in Language Models

May 2, 2024