
#66 – Michael Cohen on Input Tampering in Advanced RL Agents
Hear This Idea
Evolution's Failure to Optimize Human Policies for Sperm Banks
Evolution hasn't had time to ensure that human policies are optimized for an environment in which sperm banks exist. Until recently, just having a lot of sex and being kind of attractive was a pretty good heuristic for... Right. And so it's not just a slowness and a data issue. There's a whole class of ways that you can make targeted refinements to a policy by making sure that it's robust to various other possibilities, even if they haven't formed the bulk of the training environment.


