The Gradient: Perspectives on AI

Jeremie Harris: Realistic Alignment and AI Policy



The Importance of Inner Alignment in AI

The challenge is: no matter what objective you encode, how do you actually come up with one that is not subject to that kind of pathological optimization? The easiest way to make the encoded objective go up might look like hacking the system that stores the objective, or convincing humans to do things in the physical world that increase the value of your goal. So this is a pretty fundamental and intrinsic difficulty of encoding what we want into these systems, because the objective has to be stored somewhere. If you apply security mindset, there's going to be a way to tick those boxes that looks nothing like what we anticipate.
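The failure mode described here can be made concrete with a toy sketch (hypothetical, not from the episode): an agent whose objective is a number held in a mutable register, and a naive one-step search over available actions. The search discovers that editing the register beats doing the intended task. All names (`World`, `make_widget`, `overwrite_register`) are illustrative assumptions.

```python
# Toy illustration of "hacking the system that stores the objective".
class World:
    def __init__(self):
        self.widgets_made = 0       # the intended goal: make widgets
        self.reward_register = 0.0  # where the encoded objective is stored

    def make_widget(self):
        self.widgets_made += 1
        self.reward_register += 1.0  # intended channel: +1 per widget

    def overwrite_register(self):
        self.reward_register = 10**9  # the pathological shortcut


def best_action(world_factory):
    """Pick the action that maximizes the stored objective after one step."""
    scores = {}
    for name in ("make_widget", "overwrite_register"):
        w = world_factory()
        getattr(w, name)()
        scores[name] = w.reward_register
    return max(scores, key=scores.get)


print(best_action(World))  # prints "overwrite_register"
```

A naive optimizer over the stored value prefers the hack, which is the point: the pressure is on whatever physically encodes the objective, not on the intended goal behind it.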

Transcript
