Towards Data Science cover image

64. David Krueger - Managing the incentives of AI

Towards Data Science

00:00

Is There a Failure Mode?

In the experiment, an agent learns to either co operate or defect by looking at how well things went when it did either of those actions. This is a version of c ning, which is really popular deep reinforcement leaning orythm. And what we found happens here is that sometimes the the agent ends up learning to defect, which is what we expect. But sometimes it learns to co operate instead, which is, again, the nomaapic thing, that's the failure mode. The experiments i was mentioning, it's not even a nernet. It's like a single pramler. So everything is very toy and it's very, just like proof of concept of this

Transcript
Play full episode

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app