Are You Trying to Get Safety Without Alignment?
We don't have a great solution to the alignment problem right now. Suppose we try to control the AGI's motivations: maybe the AGI ends up motivated to do something vaguely related to what we were hoping for, or maybe not even that. The consensus of everybody in the field is that that's not a great approach. Even if you don't let your AGI access the internet, what if the next lab down the street, or across the world, lets their AGI access the internet? So boxing doesn't work unless everybody does it. And even above and beyond that, computer security practices are terrible these days. Unless neural network interpretability makes great strides from where it is today, I