8 - Assistance Games with Dylan Hadfield-Menell

AXRP - the AI X-risk Research Podcast

How Predictable Is the Predictability of Artificial Intelligence?

Predictability seems like a really desirable property for high-impact systems. But there's an interesting problem that comes up when you want to actually use the results of inference. Every reward function that you infer is exactly as likely as that reward function plus c, for every possible value of c. Inference will end up at some of these values but not others. And because you're doing Bayesian inference in high-dimensional spaces, and that's challenging, you'll end up sort of arbitrarily setting these constant values for different estimates of reward functions, which, if you naively add them together, can have an overly large influence on the outcome.
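The shift invariance described here can be illustrated with a small sketch. It assumes a Boltzmann-rational choice model (a common modelling choice in reward inference, but not one named in the episode), and every function name and number is hypothetical. The second half shows one possible way an arbitrary offset could matter: it assumes each estimate covers only some of the options and that missing entries are treated as zero when summing. That setup is an assumption made for illustration, not something described in the episode.

```python
import math

def boltzmann_likelihood(rewards, choice, beta=1.0):
    """P(choice) under an assumed Boltzmann-rational chooser:
    exp(beta * r[choice]) / sum_i exp(beta * r[i])."""
    m = max(rewards)  # subtract the max for numerical stability
    weights = [math.exp(beta * (r - m)) for r in rewards]
    return weights[choice] / sum(weights)

# 1) Adding a constant c to every reward leaves the likelihood unchanged,
#    so the observed choice cannot distinguish R from R + c.
rewards = [1.0, 2.0, 0.5]
shifted = [r + 100.0 for r in rewards]
p_original = boltzmann_likelihood(rewards, choice=1)
p_shifted = boltzmann_likelihood(shifted, choice=1)
invariant = math.isclose(p_original, p_shifted)

# 2) Hypothetical scenario: two estimates, each covering only some options.
#    Missing entries are treated as 0 when the estimates are naively summed.
def naive_sum(est_a, est_b, n_options):
    return [est_a.get(i, 0.0) + est_b.get(i, 0.0) for i in range(n_options)]

def best_option(totals):
    return max(range(len(totals)), key=lambda i: totals[i])

est_a = {0: 2.0, 1: 0.0}  # covers options 0 and 1
est_b = {2: 1.0, 3: 0.0}  # covers options 2 and 3

best_before = best_option(naive_sum(est_a, est_b, 4))

# Shift est_b by an arbitrary constant: equally likely under the model above,
# but the naive sum now favours the options that est_b happens to cover.
est_b_shifted = {k: v + 5.0 for k, v in est_b.items()}
best_after = best_option(naive_sum(est_a, est_b_shifted, 4))

print(invariant, best_before, best_after)
```

In this toy setup the data cannot tell the two versions of `est_b` apart, yet the choice of offset changes which option wins after summation.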

Play episode from 01:37:04