
#66 – Michael Cohen on Input Tampering in Advanced RL Agents
Hear This Idea
00:00
How to Combine Myopic Agents With Physical Isolation
In this idea, the operator who's providing rewards is contained alongside the agent. The key feature that I think makes this possible is you can have something trying to detect whether anything is going a mess where information might be at risk of leaking. And so then I think you probably could make a secure box where it's simply impossible for an agent of arbitrary advancement to take over the world. That would violate assumption five from my paper, which is basically that it has got to be possible.
Play episode from 01:52:15
Transcript


