
3. Evan Hubinger on Takeoff speeds, Risks from learned optimization & Interpretability
The Inside View
Generalization and Mazes
There are a bunch of different ways in which this agent could generalize. So here's one generalization: it just goes to larger mazes and it doesn't know how to solve them. It just fails to solve big mazes. But then there's another situation, which is that its capabilities generalize. And if it does the wrong thing, it could be really capably doing the wrong thing in this sort of new environment.