
Victoria Krakovna–AGI Ruin, Sharp Left Turn, Paradigms of AI Alignment
The Inside View
00:00
Navigating AI Alignment Challenges
This chapter examines how hard it is to keep an AI model's goals aligned through significant capability transitions. It highlights model identification as a critical step and the challenge of ensuring that a model's objectives match human intentions, including problems of interpretability and communication. It also introduces outer and inner alignment, distinguishes among the different specifications of an AI system, and describes the gaps between them that can lead to misalignment.