AXRP - the AI X-risk Research Podcast

8 - Assistance Games with Dylan Hadfield-Menell

Inference

So it seems like maybe during inference, let's say you randomly sample ten reward functions to get their relative likelihoods, and the reward functions have different constants added to the reward of every single state. If I take the expectation over those, then it's sort of like taking the expectation as if all the constants were zero, and then adding the expectation of the constants, right? Because expectations are linear. So wouldn't that not affect how you choose between different actions?

In theory, no. And with enough samples, no, because those averages cancel out, even with ten samples. Let
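The linearity-of-expectation point here can be checked numerically. The sketch below is a hypothetical illustration (none of these names come from the episode): it samples ten toy reward functions over two actions, adds an arbitrary per-function constant to every reward, and confirms that the expected-reward gap between actions is unchanged, so the constants cannot affect which action is chosen.

```python
import random

# Hypothetical sketch: per-reward-function constant offsets cancel
# when comparing actions under an expectation over sampled rewards.
random.seed(0)

ACTIONS = ["a", "b"]

# Ten sampled reward functions, each mapping an action to a reward.
base_rewards = [{"a": random.gauss(0, 1), "b": random.gauss(0, 1)}
                for _ in range(10)]
# Each sampled function gets its own constant added to every reward.
offsets = [random.gauss(0, 5) for _ in range(10)]

def expected_reward(action, rewards):
    # Monte Carlo estimate: average reward of `action` across samples.
    return sum(r[action] for r in rewards) / len(rewards)

shifted = [{a: r[a] + c for a in ACTIONS}
           for r, c in zip(base_rewards, offsets)]

gap_base = expected_reward("a", base_rewards) - expected_reward("b", base_rewards)
gap_shifted = expected_reward("a", shifted) - expected_reward("b", shifted)

# The offsets appear in both actions' expectations, so they cancel
# in the difference; the action ranking is identical.
assert abs(gap_base - gap_shifted) < 1e-9
```

Because each constant shifts every action's reward equally, it adds the same term to both expectations and drops out of the comparison, which is the cancellation described above.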

Transcript from 01:41:27
