
#66 – Michael Cohen on Input Tampering in Advanced RL Agents
Hear This Idea
Evolution and the Future of Genetic Fitness
The policy that evolution is selecting for is only trained on real data. What you can do in a computer is use some learned model of the world to generate other hypothetical data about what the effects of different actions or policies would be. You can, like, reason counterfactually. And so if you're in a situation where X has always been a good proxy for inclusive genetic fitness, but given a pretty simple understanding of the world, you can recognize that there are certain circumstances where it no longer would be.
Play episode from 01:45:29


