#65 – Katja Grace on Slowing Down AI and Whether the X-Risk Case Holds Up

Hear This Idea

The Problem With AI Learning to Understand Our Values

The challenge is getting these systems to share those values, or just to care about those things in some robust sense. It seems like thinking of this in terms of whether it survives or not doesn't actually make sense unless it's at the very end of training. How does it come to know which weight changes are good? It has to understand the workings of its own mind or something. Also, even if it can, it's not immediately obvious to me that the one where it does the thing you wanted it to do is better for it.
