Algorithms That Optimize Over Assistance

The optimal agent has to maintain a probability distribution over all the possible reward functions that Alice could have. And as you probably know, Bayesian updating over a large list of hypotheses is computationally intractable. Another way of seeing it is that if you take this assistance paradigm, you can, through a relatively simple reduction, turn it into a partially observable Markov decision process, or POMDP.
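To make the belief-maintenance step concrete, here is a minimal sketch of one Bayesian update over a small, finite set of candidate reward functions. It is not taken from the episode: the function name `update_belief`, the two fruit hypotheses, and the Boltzmann-rational model of Alice's choices (with rationality parameter `beta`) are all assumptions chosen for illustration.

```python
import math


def update_belief(belief, action, q_values, beta=1.0):
    """Return the posterior over reward hypotheses after observing one action.

    belief   -- dict mapping hypothesis name -> prior probability
    q_values -- dict mapping hypothesis name -> {action: value to Alice}
    beta     -- assumed rationality of Alice (higher means less noisy choices)
    """
    unnormalized = {}
    for hypothesis, prior in belief.items():
        values = q_values[hypothesis]
        # Assumed Boltzmann-rational likelihood: P(action | hypothesis).
        normalizer = sum(math.exp(beta * v) for v in values.values())
        likelihood = math.exp(beta * values[action]) / normalizer
        unnormalized[hypothesis] = prior * likelihood
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical example: two candidate reward functions for Alice.
prior = {"likes_apples": 0.5, "likes_pears": 0.5}
q_values = {
    "likes_apples": {"apple": 1.0, "pear": 0.0},
    "likes_pears": {"apple": 0.0, "pear": 1.0},
}
posterior = update_belief(prior, "apple", q_values)
```

Each update loops over every hypothesis, so the cost grows with the size of the hypothesis set. With a realistic space of reward functions that set is enormous, which is one way to read the intractability point above. In the POMDP view, Alice's reward function would play the role of the hidden part of the state, and `posterior` is the agent's belief over it.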
