The Drawbacks of Reinforcement Learning From Human Feedback

There are things missing in the model, and that's what I've been focused on. One reason is that humans separate knowledge of the world from how to make decisions. In a Go-playing system like AlphaGo, for example, you don't need to interact with the rest of the world. The machinery to take that information and turn it into good decisions is very complex. That's why we need really, really large neural nets to do the inference.
