LessWrong (Curated & Popular)

"How 'Discovering Latent Knowledge in Language Models Without Supervision' Fits Into a Broader Alignment Scheme" by Collin

Jan 12, 2023