
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI
Dwarkesh Podcast
The Innovation of Post-Training in Language Models
The speaker initially focused on scaling and pre-training in language models, but it was only after the release of GPT-3 that they recognized the potential of post-training and pivoted their work toward it, realizing that post-training could produce smarter models with less compute. Although most compute currently goes to pre-training rather than post-training, they argue that post-training enables models to generate high-quality outputs on their own. Taking a first-principles view, the speaker sees considerable gains still available in post-training and anticipates further progress in this area as more compute is invested in it.