
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI
Dwarkesh Podcast
The Innovation of Post-Training in Language Models
The speaker initially focused on scaling and pre-training in language models, but it was only after the release of GPT-3 that they recognized the potential of post-training and pivoted their work toward it, realizing that post-training could produce smarter models with less compute. Although most compute currently goes to pre-training rather than post-training, they argue that post-training enables models to generate high-quality outputs on their own. Taking a first-principles view, the speaker sees considerable gains still available in post-training and anticipates further progress in this area as more compute is invested in it.