The Risk of Deception in AI Selection

If you have a multi-agent competition, each agent basically has to sacrifice what it values to gain power, or else that agent will cease to exist. So everyone is incentivized to deploy the most capable system. If there were no deception involved, they'd say, "Okay, we're not going to deploy it, because it's obviously misaligned." But deception will be involved, so the misalignment won't be obvious, and they'll have an incentive to deploy it. That erodes how much they care about alignment at all.
