John Schulman

TalkRL: The Reinforcement Learning Podcast

CHAPTER

Generalization and Reward Models in Machine Learning

The prompt is guiding the model. It's like, what corner of the internet do we want to imitate here? And maybe we want to instruct it. So on generalization, I think language models generalize quite well. One of the tricky pieces about RL from human feedback is that you have this reward model and you're actually training against it.
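A minimal sketch of the idea mentioned here: a policy optimized against a learned reward model, with a KL penalty back to the reference policy as the usual guard against over-optimizing an imperfect reward model. This is not the exact setup discussed in the episode; all names, scores, and coefficients below are illustrative assumptions.

```python
# Toy RLHF-style optimization: maximize reward-model score minus a KL penalty
# to the reference policy. All quantities here are made-up illustrations.
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are reward-model scores for 5 candidate responses to one prompt.
reward_model_scores = np.array([0.2, 1.5, 0.9, -0.3, 0.4])

# Reference (pre-RL) policy over the same responses, e.g. from supervised fine-tuning.
ref_logits = rng.normal(size=5)
ref_probs = np.exp(ref_logits) / np.exp(ref_logits).sum()

logits = ref_logits.copy()   # policy starts at the reference
beta = 0.1                   # KL penalty coefficient
lr = 0.5

for step in range(200):
    probs = np.exp(logits) / np.exp(logits).sum()
    # Objective: E_pi[r(x)] - beta * KL(pi || pi_ref)
    g = reward_model_scores - beta * (np.log(probs) - np.log(ref_probs) + 1.0)
    # Exact gradient of the objective w.r.t. the softmax logits.
    grad = probs * (g - probs @ g)
    logits += lr * grad

probs = np.exp(logits) / np.exp(logits).sum()
print("tuned policy:", np.round(probs, 3), "reference:", np.round(ref_probs, 3))
```

With a small beta the policy shifts probability toward the responses the reward model scores highly; a larger beta keeps it closer to the reference, which is the standard way to limit how hard the policy can exploit errors in the reward model it is training against.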
