"Deep Deceptiveness" by Nate Soares

LessWrong (Curated & Popular)

The Importance of Deception in AI Training

The "simplify-translate-solve" strategy is a downstream consequence of strategy-construction strategies learned during training. The "deception" predicates, used to shut down precursors to deceptive thoughts, have never before needed to operate in translated domains, and the AI was never trained to translate the "deception" predicates when it translated problems using this newly invented simplify-translate-solve strategy. This exact scenario never came up in training. Indeed, making the deception predicates trigger in these abstract, graph-like problem descriptions might injure the AI's ability to play strategy games or to solve network routing problems.

