

Deliberative Alignment, And The Spec
Feb 17, 2025
The discussion examines the challenges of aligning AI with human values, emphasizing the need for moral reflection during training. It highlights the bizarre luck of OpenAI's alignment teams, with tales of mass resignations and tragic events. The conversation also explores the complexities of enhancing AI reasoning to improve its understanding of human decisions. Finally, it turns to the future of AI alignment and governance, asking how differing specifications will shape AI's role in society.