The Implications of AI Alignment

The field is sometimes called AI alignment because, to the extent that an AI system is going to optimize an objective effectively, you need that objective to be aligned with what you actually want. Examining the objective function isn't necessarily enough, for a lot of reasons, and this is quite contested. We can also look inside the systems and see what features they're picking up on. If there are internal representations of objectives, we can think about ways of supervising and structuring systems that give them a more transparent internal structure than a black-box optimizer.
