

“The Field of AI Alignment: A Postmortem, and What To Do About It” by johnswentworth
Dec 26, 2024
johnswentworth, a LessWrong author, dissects the current state of AI alignment research. He uses the metaphor of searching for lost keys under a streetlight to illustrate how researchers gravitate toward easy, tractable problems while neglecting the hard ones that actually bear on existential risk. The discussion then turns to the urgent need for a recruitment overhaul, arguing that the field should seek out people with much stronger technical skills in order to foster genuinely new approaches. Overall, the episode challenges the field's existing paradigms and pushes researchers to confront the real challenges of AI safety.
Alice and Bob
- Alice tackles a hard alignment problem, fails, and pivots to something more tractable.
- Bob succeeds at an easy problem, so the field's visible successes cluster on easy problems, creating a misleading selection effect.
Sam, Eliezer, and Paul
- "Sam", needing an AI safety strategy compatible with racing towards AGI, chooses a convenient model.
- This model, though attributed to "Paul", is altered to further support the race.
Carol's Flinch
- Carol runs into a core difficulty of the ELK (Eliciting Latent Knowledge) problem and briefly considers a harder but more relevant line of attack.
- Overwhelmed by the difficulty, she flinches away and quickly reverts to easier problems.