Critical review of Christiano’s disagreements with Yudkowsky

LessWrong (Curated & Popular)

Evaluation Challenges and Imitation Learning in Alignment Proposals

This chapter explores the difficulties of evaluating informal natural language arguments and the limitations of imitation learning in capturing irrelevant human behaviors. It also mentions Christiano's ELK program for an end-to-end solution.

