

“The Field of AI Alignment: A Postmortem, and What To Do About It” by johnswentworth
Dec 26, 2024
johnswentworth, a LessWrong author, dissects the current state of AI alignment research. He uses the metaphor of searching for lost keys under a streetlight to illustrate how researchers gravitate toward easy, tractable problems while neglecting the hard ones that actually bear on existential risk. The discussion then turns to the urgent need for a recruitment overhaul, arguing that the field should seek out people with much stronger technical skills in order to foster genuinely new approaches. Overall, the episode challenges the field's existing paradigms and pushes researchers to confront the real challenges of AI safety.
Alice and Bob
- Alice tackles a hard alignment problem, fails, and pivots to something more tractable.
- Bob succeeds at an easy problem, so the field's visible successes cluster on easy problems, creating a misleading selection effect.
Sam, Eliezer, and Paul
- "Sam", needing an AI safety strategy compatible with racing towards AGI, chooses a convenient model.
- This model, though attributed to "Paul", is altered to further support the race.
Carol's Flinch
- Carol runs into a core difficulty of the ELK (Eliciting Latent Knowledge) problem and briefly considers a harder but more relevant line of attack.
- Overwhelmed by the difficulty, she flinches away and quickly reverts to easier problems.