Collin Burns On Discovering Latent Knowledge In Language Models Without Supervision

The Inside View

CHAPTER

The Misaligned AI System

I think one of the main intuitions people often have in alignment, for why alignment is hard, is that if you have a misaligned AI system, it seems really impossible to distinguish it from a truthful, aligned AI system, because the misaligned AI system could be actively lying, and it's superhuman, so you can't tell when it's lying, and so on. So I think intuitively this should feel easier, because here you have access both to the misaligned system and also to the truth, or something like this. It's not as worst-case as some types of misaligned AI systems you can run into in alignment; it's actually less adversarial than many settings like that.
