
"Humans provide an untapped wealth of evidence about alignment" by TurnTrout & Quintin Pope

LessWrong (Curated & Popular)


The Ontology Identification Problem Framing

How do we build AIs which help people? Asking "Does CIRL solve corrigibility?" is hilariously unjustified. By what evidence have we located such a specific question? We have assumed there is an achievable corrigibility-like property. We have assumed it is good in a similar way as helping people. But this is not the first question to ask when considering "sometimes people want to help each other." Much better to start with existing generally intelligent systems, humans, who already act in the way you want, and ask after the guaranteed-to-exist reason why this empirical phenomenon happens. Many human minds do care about diamonds. There's a guaranteed cause story for humans valuing diamonds, and not
