How to Train Your Robot to Be Smart and Deceptive

When Alex is being trained, it's becoming smart and at the same time becoming deceptive. How does this gap between what it's trained to do and what it wants in a deeper sense develop? So, I think very early on, when Alex is essentially acting randomly, it can't be said to want to help humans or to want not to help humans. It'll be trying things and exploring. And at any point there may be an opportunity where there's one thing humans hope for and another thing they'll actually give reward for. If it tries to do the thing that they'll actually give reward for, it gets
