
"Deep Deceptiveness" by Nate Soares
LessWrong (Curated & Popular)
The Future of AI
When the AI is young, perhaps all its inclination towards deceptiveness comes from precursor impulses that you can train it to simply shy away from. But as the AI matures, it gets access to more abstract ways to get the benefits of deceptiveness without needing to plow through object-level flinches. When more abstract actions that attain some tangible benefit become available, the compunctions that you've baked in can fail to bind to the abstractly represented plan. This particular story is unlikely, implausible, overly specific, etc. I make no claim that the actual reasoning of early nascent AGIs will look anything like this. I expect it to be weirder, more alien, and


