
Rohin Shah
TalkRL: The Reinforcement Learning Podcast
Interactive Reward Learning
Inversoro paradigm assumes that we were not uncertain in the end. So there can be uncertainty, but the human isn't in the environment. The interactive reward learning setting is mostly just a thing we made up, because it was a thing that people often brought up as, maybe this will work. We wanted to talk about it and show why it doesn't, in fact, work.