
Rohin Shah

TalkRL: The Reinforcement Learning Podcast


Learning From Human Feedback Is Better Than Reward Learning

Rohin Shah: I can point to this one as something surprising to me. It's, like, "Towards a Decision-Theoretic Model of Assistance", or something like that. And then there's also "Cooperative Inverse Reinforcement Learning" from CHAI. The idea with this paper was just to take the models that had already been proposed in those papers and explain why they were so nice, as opposed to other things that the field could be doing.

