AI Safety Fundamentals: Alignment

Deceptively Aligned Mesa-Optimizers: It’s Not Funny if I Have to Explain It

May 13, 2023