
3. Evan Hubinger on Takeoff speeds, Risks from learned optimization & Interpretability
The Inside View
Is It in the Right Side for You?
Evan Hubinger: I don't think we are currently quite able to get there, but I am hoping that we will eventually be able to answer such questions by literally looking inside our models. If we have good enough transparency tools, we can look inside models and discover how they work: are they doing optimization, or are they not doing optimization? That is something I hope we can eventually do. And [unclear] just says: generalize according to whatever objective you learn.