
"Discussion with Nate Soares on a key alignment difficulty" by Holden Karnofsky
LessWrong (Curated & Popular)
How an AI Avoids POUDA
An AI is dangerous if it's powerful, i.e., it has the ability to disempower humans, and it aims, perhaps as a side effect of aiming at something else, at doing so. This is a weaker condition than "maximizes utility according to some relatively simple utility function of states of the world." Avoiding POUDA (pretty obviously unintended/dangerous actions) doesn't necessarily require fully or perfectly internalizing some "corrigibility core." Holden thinks there may be alternative approaches to training AI systems that are powerful enough to do things that help us a lot. Nate disagrees: he thinks there is a deep tension between the first two points. Navigating that tension isn't necessarily impossible, but most people just don't seem…


