
"Humans provide an untapped wealth of evidence about alignment" by TurnTrout & Quintin Pope
LessWrong (Curated & Popular)
Human Value Formation Is Too Complex to Understand in Alignment Related Ways
It can be true that the existing minds are too hard for us to understand in ways relevant to alignment. If human value formation were sufficiently complex, with sufficiently many load-bearing parts, such that each part drastically affects human alignment properties, then we might instead want to design simpler, human-comprehensible agents and study their alignment properties. But, I mean, come on. Imagine an alien visited and told you, "Oh, yes, the AI alignment. We knocked that one out."


