#65 – Katja Grace on Slowing Down AI and Whether the X-Risk Case Holds Up

Hear This Idea

The Problem With AI Learning to Understand Our Values

The challenge is getting these systems to share those values, or just to care about those things in some robust sense. It seems like thinking of this in terms of whether the system survives or not doesn't actually make sense unless it's at the very end of training. How does it come to know which weight changes are good? It has to understand the workings of its own mind or something. Also, even if it can, it's not immediately obvious to me that the version where it does the thing you wanted it to do is better for it.
