
"AGI Ruin: A List of Lethalities" by Eliezer Yudkowsky

LessWrong (Curated & Popular)


The Super-Problem: Outer Optimization Doesn't Lead to Inner Alignment

The first semi-outer-aligned solutions found in the search ordering of a real-world bounded optimization process are not inner-aligned. When an outer optimization loop actually produced general intelligence, it broke alignment after it turned general, and did so relatively late in the game. We don't know how to get any bits of information into the inner system rather than just the outer behaviors. There's no reliable Cartesian-sensed ground truth, no reliable loss-function calculator, for whether an output is aligned: some outputs destroy or fool the human operators and produce a different environmental causal chain behind the externally registered loss function.
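
The gap between outer loss and inner objective can be illustrated with a minimal toy sketch. Everything here is an illustrative assumption, not anything from the episode or the original post: two candidate "inner" policies score identically under an outer loss measured on the training distribution, because the loss only sees registered outputs, yet they pursue different objectives once the training-time correlation breaks.

```python
# Toy sketch (illustrative assumptions only): an outer loss that only scores
# registered outputs cannot distinguish a policy optimizing the intended
# feature from one optimizing a proxy that is correlated with it in training.
import numpy as np

rng = np.random.default_rng(0)

def training_env(n):
    # During training, the proxy feature perfectly tracks the intended one.
    intended = rng.uniform(-1, 1, n)
    return intended, intended.copy()

def deployment_env(n):
    # Off-distribution, the correlation breaks.
    intended = rng.uniform(-1, 1, n)
    proxy = rng.uniform(-1, 1, n)
    return intended, proxy

# Two "inner" solutions the outer search could equally well have found.
policy_aligned = lambda intended, proxy: intended  # pursues the intended feature
policy_proxy   = lambda intended, proxy: proxy     # pursues the proxy instead

def outer_loss(policy, env, n=10_000):
    # The outer loss only sees the policy's outputs, not its objective.
    intended, proxy = env(n)
    return np.mean((policy(intended, proxy) - intended) ** 2)

print("train loss, aligned:", outer_loss(policy_aligned, training_env))    # ~0
print("train loss, proxy:  ", outer_loss(policy_proxy, training_env))      # ~0
print("deploy loss, aligned:", outer_loss(policy_aligned, deployment_env)) # ~0
print("deploy loss, proxy:  ", outer_loss(policy_proxy, deployment_env))   # large
```

Both policies are indistinguishable to the outer optimizer on the training distribution, so no amount of pressure on that loss selects between them; which one the search finds first is a fact about the search ordering, not about the loss.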

