
"Humans provide an untapped wealth of evidence about alignment" by TurnTrout & Quintin Pope
LessWrong (Curated & Popular)
Ontology Identification for a C T L
Ontology identification is unattractive in some ways, but this post isn't meant to argue against that framing. Humans provide tons of evidence about alignment by virtue of containing guaranteed-to-exist mechanisms which produce their values around diamonds. One time I didn't look for the human mechanism. Back in 2018, I had a clever-seeming idea. We don't know how to build an aligned AI. We want multiple tries. It would be great if we could build an AI which knows it may have been incorrectly designed. So why not have the AI simulate its probable design environment over many misspecifications, and then not do plans which tend to be horrible for most


