The Easy Goal Inference Problem Is Still Hard

AI Safety Fundamentals: Alignment

The Ambitious Value Learning Approach to AI Safety

Paul Christiano: In order to infer preferences that can lead to superhuman performance, it is necessary to understand how humans are biased. This approach has the major advantage that we can begin empirical work today. We can actually build systems which observe user behavior and try to figure out what the user wants. There are many applications that people care about already, and we can set to work on making rich toy models. It seems great to develop these capabilities in parallel with other AI progress.
