Episode 28: Sergey Levine, UC Berkeley, on the bottlenecks to generalization in reinforcement learning, why simulation is doomed to succeed, and how to pick good research problems

Generally Intelligent

How to Improve Language Models Through Value-Based Reinforcement

A lot of the ways that people do RL with language models now treat the language model's task as a one-step problem. But if we're thinking about counterfactuals, those are typically situated in a multi-step process. So I think there's actually a lot of potential to get much more powerful language models with appropriate value-based reinforcement learning.
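
To make the one-step versus multi-step distinction concrete, here is a minimal sketch on a toy two-turn dialogue. It is not from the episode: the states, actions, rewards, and function names are all hypothetical, and tabular Q-iteration merely stands in for whatever value-based method one would use with a real language model. The one-step view scores each response by its immediate reward only, while the value-based view also credits a response for the reward it makes possible on later turns.

```python
# Toy two-turn dialogue. Every state, action, and reward here is a
# hypothetical illustration, not data from the episode.
# TRANSITIONS[state][action] = (reward, next_state); next_state None ends the dialogue.
TRANSITIONS = {
    "start": {
        "answer_now": (0.3, None),             # mediocre immediate answer
        "ask_clarifying": (0.0, "clarified"),  # no reward yet, but sets up a better answer
    },
    "clarified": {
        "answer": (1.0, None),                 # good answer after clarification
    },
}


def one_step_choice(state):
    """One-step (bandit) view: pick the action with the highest immediate reward."""
    actions = TRANSITIONS[state]
    return max(actions, key=lambda a: actions[a][0])


def q_iteration(gamma=0.9, sweeps=10):
    """Multi-step view: back up Q(s, a) = r + gamma * max_a' Q(s', a')."""
    q = {s: {a: 0.0 for a in acts} for s, acts in TRANSITIONS.items()}
    for _ in range(sweeps):
        for s, acts in TRANSITIONS.items():
            for a, (reward, nxt) in acts.items():
                future = max(q[nxt].values()) if nxt is not None else 0.0
                q[s][a] = reward + gamma * future
    return q


def value_based_choice(state, q):
    """Pick the action with the highest learned Q-value."""
    return max(q[state], key=q[state].get)


if __name__ == "__main__":
    q = q_iteration()
    print("one-step choice:   ", one_step_choice("start"))
    print("value-based choice:", value_based_choice("start", q))
```

In this toy setup the one-step view prefers answering immediately, whereas the value-based view prefers the clarifying question because its discounted future reward is higher. That is the sense in which a multi-step formulation can capture counterfactual consequences that a one-step formulation misses.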
