
#66 – Michael Cohen on Input Tampering in Advanced RL Agents
Hear This Idea
00:00
How to Expect an Agent to Understand the World
So when the agent is learning how different actions, different sequences of actions would lead to different rewards and coming up with hypotheses, at least as well as us, there are two hypotheses that I can think of that would explain past data equally well. The first is the reward I get is equal to the number that the box displays. And you could imagine others like the reward gets sent down some wire after it's been processed by the optical character recognition program. You could think of a whole class of these, but for now, let's just think of these two to understand the concepts at play.
Play episode from 36:38
Transcript


