
"Deep Deceptiveness" by Nate Soares
LessWrong (Curated & Popular)
00:00
The AI Notices That There Are Thoughts That Cannot Think Directly
The problem is particularly stark and clear at the extreme where the AI is coding up whole other AIs. The monitoring thread attempts to address this inadequacy by translating the thoughts that worked in the abstract graph domain back over to the wet lab instruction manipulation domains. To its surprise, all those thoughts translate cleanly. There's no gap in knowledge. This sharply violates the monitoring threads model of why the original thread was hiccuping. Reflective deliberation boots up to start looking directly into the issue of this hiccuped thread. And once you're looking, it's obvious, the thoughts are not running smoothly in this domain.There are certain topics where the thoughts simply fail to operate
Play episode from 19:07
Transcript


