
8 - Assistance Games with Dylan Hadfield-Menell

AXRP - the AI X-risk Research Podcast


Risk-Averse Trajectory Optimization for Utility Functions

When you're looking to maximize reward for, say, the minimum reward function in your hypothesis set, that minimization is more often determined by the constants in the reward function inference than it is by the actual reward values themselves. In order to fix this problem, what you have to specify is a point that all the reward functions kind of agree on right away, so that that constant is the same for everything. And this standardization specifies the fallback behavior when the system has high uncertainty about reward valuations. It's sort of, very clearly, at a low level in the math, telling you: here is the point where you put in the predictable part of what the system will do.

Transcript snippet from 01:43:05
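For a rough sense of what that maximin-with-normalization idea looks like, here is a minimal sketch (my own illustration, not code from the episode); `trajectories`, `reward_fns`, and `reference_trajectory` are hypothetical placeholders for the candidate plans, the sampled reward hypotheses, and the agreed-upon reference point.

```python
# A minimal sketch of risk-averse trajectory selection: maximize the worst-case
# reward over a hypothesis set of reward functions, after normalizing every
# function to agree at a shared reference point.

def maximin_trajectory(trajectories, reward_fns, reference_trajectory):
    """Return the trajectory with the best worst-case normalized reward.

    Subtracting each reward function's value at the reference trajectory
    removes the arbitrary constants, so the inner min reflects genuine
    disagreement about reward rather than offsets. If the reference
    trajectory is among the candidates, it scores exactly zero under every
    hypothesis, so under high uncertainty the system falls back toward it.
    """
    best_traj, best_worst_case = None, float("-inf")
    for traj in trajectories:
        # Worst-case advantage of this trajectory over the reference point,
        # taken across all reward hypotheses.
        worst_case = min(r(traj) - r(reference_trajectory) for r in reward_fns)
        if worst_case > best_worst_case:
            best_traj, best_worst_case = traj, worst_case
    return best_traj, best_worst_case
```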
