#66 – Michael Cohen on Input Tampering in Advanced RL Agents

Hear This Idea

Evolution's Failure to Optimize Human Policies for Sperm Banks

Evolution hasn't had time to ensure that human policies are optimized for an environment in which sperm banks exist. Until recently, just like having a lot of sex and being kind of attractive, it was a pretty good heuristic for... Right. And so it's not just a slowness and a data issue. There's a whole class of ways that you can make targeted refinements to a policy by making sure that they're robust to various other possibilities - even if they haven't formed the bulk of the training environment.

Play episode from 01:48:12