
8 - Assistance Games with Dylan Hadfield-Menell
AXRP - the AI X-risk Research Podcast
Risk-Averse Trajectory Optimization for Utility Functions
When you're looking to maximize reward for, say, the minimum reward function in your hypothesis set, that minimization is more often determined by the constants in this reward function inference than it is by the actual reward values themselves. In order to fix this problem, what you have to specify is a point that all the reward functions agree on right away, so that that constant is the same for everything. And this standardization specifies the fallback behavior when the system has high uncertainty about reward valuations. It's sort of very clearly, at a low level in the math, telling you: here is the point where you put in the predictable part of what the system will do…
(Transcript excerpt; clip from 01:43:05 of the episode.)
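To make the anchoring idea above concrete, here is a minimal Python sketch; it is not from the episode or from any of Dylan's papers, and the array shapes, offsets, and the choice of a "reference trajectory" are illustrative assumptions. It picks the trajectory with the best worst-case reward over a set of reward hypotheses, first with each hypothesis carrying an arbitrary additive constant (which can dominate the min), then after subtracting each hypothesis's value at a common reference point so the constants cancel.

```python
import numpy as np

# Sketch of risk-averse (maximin) trajectory choice over reward hypotheses.
# Each hypothesis is only identified up to an additive constant, so the
# worst case can be driven by those constants unless all hypotheses are
# anchored to agree at one common reference point.

rng = np.random.default_rng(0)

n_hypotheses = 5
n_trajectories = 4
reference_traj = 0  # hypothetical fallback point all rewards are anchored on

# Per-trajectory values under each hypothesis, plus an arbitrary constant
# offset per hypothesis (the part the reward inference cannot pin down).
values = rng.normal(size=(n_hypotheses, n_trajectories))
offsets = rng.normal(scale=10.0, size=(n_hypotheses, 1))
rewards = values + offsets

def maximin_choice(reward_matrix):
    """Pick the trajectory whose worst-case reward across hypotheses is highest."""
    worst_case = reward_matrix.min(axis=0)  # min over hypotheses
    return int(worst_case.argmax())         # max over trajectories

# Without anchoring: the min is largely decided by the per-hypothesis offsets.
print("raw maximin choice:", maximin_choice(rewards))

# With anchoring: subtract each hypothesis's value at the reference point,
# so every hypothesis assigns it reward 0 and the constants cancel.
anchored = rewards - rewards[:, [reference_traj]]
print("anchored maximin choice:", maximin_choice(anchored))
```

In the anchored version the reference point scores zero under every hypothesis, so it plays the role of the predictable fallback described above: when the system is highly uncertain about reward valuations, that is the point it can always retreat to.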


