"Humans provide an untapped wealth of evidence about alignment" by TurnTrout & Quintin Pope

LessWrong (Curated & Popular)

Human Value Formation Is Too Complex to Understand in Alignment-Related Ways

It can be true that the existing minds are too hard for us to understand in ways relevant to alignment. If human value formation were sufficiently complex, with sufficiently many load-bearing parts, such that each part drastically affects human alignment properties, then we might instead want to design simpler, human-comprehensible agents and study their alignment properties. But, I mean, come on, imagine an alien visited and told you, "Oh, yes, the AI alignment. We knocked that one out."

