
3. Evan Hubinger on Takeoff speeds, Risks from learned optimization & Interpretability
The Inside View
Robustness
In the robustness-centric version, we split alignment into outer alignment and robustness at the top level. Outer alignment asks: is the base objective doing the right thing? And robustness asks: does it generalize well according to the base objective? Then we can split that into objective robustness and capability robustness. But I think if you're thinking mostly about mesa-optimizers, then inner alignment, or inner alignment plus outer alignment, gives you [unclear] way of achieving [unclear].