
Episode 28: Sergey Levine, UC Berkeley, on the bottlenecks to generalization in reinforcement learning, why simulation is doomed to succeed, and how to pick good research problems
Generally Intelligent
How to Improve Language Models Through Value-Based Reinforcement
A lot of the ways that people do RL with language models now treat the language model's task as a one-step problem. But if we're thinking about counterfactuals, that is typically situated in a multi-step process. So I think there's actually a lot of potential to get much more powerful language models with appropriate value-based reinforcement.