
#66 – Michael Cohen on Input Tampering in Advanced RL Agents
Hear This Idea
The Different Types of Reward Learning
The way reinforcement learning works is you have a program called an agent. It receives observations and it receives rewards, and it tries to learn how its actions, and its past history of actions, lead to future rewards. That's the RL problem: how do we come up with a program that does that successfully?
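To make the loop described here concrete, the following is a minimal sketch and not anything presented in the episode. The `TwoArmedBandit` environment, the `EpsilonGreedyAgent` class, the payoff probabilities, and the exploration rate are all hypothetical choices made for illustration. The agent acts, receives an observation and a reward, and updates a running estimate of how much reward each action tends to produce. This toy agent ignores the observation and its longer history, which a more capable agent would use.

```python
import random


class TwoArmedBandit:
    """Hypothetical toy environment: action 1 pays off more often than action 0."""

    def __init__(self, rng):
        self.rng = rng
        self.payoff_prob = [0.2, 0.8]  # assumed values, chosen for illustration

    def step(self, action):
        # Return an (observation, reward) pair for the chosen action.
        reward = 1.0 if self.rng.random() < self.payoff_prob[action] else 0.0
        observation = 0  # a single dummy observation; this toy agent ignores it
        return observation, reward


class EpsilonGreedyAgent:
    """Keeps a running average of the reward seen for each action."""

    def __init__(self, n_actions, epsilon, rng):
        self.rng = rng
        self.epsilon = epsilon
        self.counts = [0] * n_actions
        self.values = [0.0] * n_actions

    def act(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def learn(self, action, reward):
        # Incremental mean: move the estimate toward the observed reward.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def run(steps=2000, seed=0):
    rng = random.Random(seed)
    env = TwoArmedBandit(rng)
    agent = EpsilonGreedyAgent(n_actions=2, epsilon=0.1, rng=rng)
    total_reward = 0.0
    for _ in range(steps):
        action = agent.act()
        _observation, reward = env.step(action)
        agent.learn(action, reward)
        total_reward += reward
    return agent, total_reward


if __name__ == "__main__":
    agent, total_reward = run()
    print("estimated value per action:", agent.values)
    print("total reward:", total_reward)
```

Epsilon-greedy is used only because it is short; it stands in for whatever learning method a real agent would use.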


