
8 - Assistance Games with Dylan Hadfield-Menell
AXRP - the AI X-risk Research Podcast
How Predictable Is the Predictability of Artificial Intelligence?
Predictability seems like a really desirable property for high-impact systems. But there's an interesting problem that comes up when you want to actually use the results of inference. Every reward function that you infer is exactly as likely as that reward function plus c, for every possible value of c. Inference will end up at some of these values but not others. And because you're doing Bayesian inference in high-dimensional spaces, and that's challenging, you'll end up sort of arbitrarily setting these constant values for different estimates of reward functions, which, if you naively add them together, can have an overly large influence on the outcome.
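The claim that a reward function and the same function plus a constant are equally likely can be checked numerically. The sketch below is only an illustration: it assumes a Boltzmann (softmax) choice model, which the excerpt does not name, and the reward values, the observed choices, and the helper names `choice_probs` and `log_likelihood` are all hypothetical.

```python
import math


def choice_probs(rewards, beta=1.0):
    """Softmax choice probabilities (assumed Boltzmann-rational model)."""
    top = max(rewards)  # subtract the max for numerical stability
    weights = [math.exp(beta * (r - top)) for r in rewards]
    total = sum(weights)
    return [w / total for w in weights]


def log_likelihood(rewards, observed_choices, beta=1.0):
    """Log-probability of the observed choices under the given rewards."""
    probs = choice_probs(rewards, beta)
    return sum(math.log(probs[i]) for i in observed_choices)


# Hypothetical rewards for three options, and made-up observed choices.
rewards = [1.0, 0.5, -2.0]
observed = [0, 0, 1, 0]

base = log_likelihood(rewards, observed)

# Shift every reward by the same constant c and recompute the likelihood.
shifted = {
    c: log_likelihood([r + c for r in rewards], observed)
    for c in (-100.0, 3.0, 100.0)
}

for c, value in shifted.items():
    print(f"c = {c:+.1f}: log-likelihood difference = {value - base:.3e}")
```

Under this assumed model the data cannot distinguish any value of c from any other, so whichever offset an inference procedure returns is arbitrary.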
Play episode from 01:37:04


