
#66 – Michael Cohen on Input Tampering in Advanced RL Agents
Hear This Idea
The Cost of a Comet Defense System
Actions are judged according to how well they maximize the number the camera sees. If it can increase the probability that it gets maximal reward for the next million years up from 1 − 10⁻⁶, it'll pick the actions that do that. But what if a comet comes? I guess there's some trade-off with how closely this is. It's unclear how you'd build that into an agent.
Play episode from 54:18
Transcript


