BI 151 Steve Byrnes: Brain-like AGI Safety

Brain Inspired

What Is AGI Interpretability?

We want to understand what the AGI is thinking in all its full, glorious detail. If we got that, it would solve all kinds of problems: you wouldn't have to worry about the AGI deceiving you. So if you had full, glorious interpretability, and you wanted an AGI that is motivated to be honest, then you could catch it lying a few times with perfect reliability. We want the AGI to be motivated not to lie, but it's not so good if the AGI is merely motivated to not get caught lying. In sort of the same way that the amygdala can learn that something is going to lead to goosebumps. I think the amygdala, maybe
