
"Humans provide an untapped wealth of evidence about alignment" by TurnTrout & Quintin Pope
LessWrong (Curated & Popular)
The Diamond Maximizer Problem
The problem isn't that it's impossible to specify a mind which cares about diamonds. We already know there are intelligent minds who value diamonds. Clearly, the genome plus environment jointly specifies certain human beings who end up caring about diamonds. Where is the evidence required to locate these ideas? (That's a link.) Why should I even find myself thinking about diamond maximization and AIXI and Turing machines and utility functions in this situation? It's not that there's no evidence. For example, utility functions ensure the agent can't be exploited in some dumb ways.


