How to Combine Myopic Agents With Physical Isolation

In this idea, the operator who's providing rewards is contained alongside the agent. The key feature that I think makes this possible is you can have something trying to detect whether anything is going a mess where information might be at risk of leaking. And so then I think you probably could make a secure box where it's simply impossible for an agent of arbitrary advancement to take over the world. That would violate assumption five from my paper, which is basically that it has got to be possible.

Play episode from 01:52:15

Transcript

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app