
Inside s1: An o1-Style Reasoning Model That Cost Under $50 to Train with Niklas Muennighoff - #721

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

CHAPTER

Optimizing Reasoning Model Training

This chapter covers how a reasoning model is trained on a refined dataset, with emphasis on selecting and filtering high-quality questions. The speakers discuss how the Gemini thinking model has evolved and highlight the role incorrect answers play in fostering better reasoning. They also introduce budget forcing and token-based guidance, two techniques for improving model performance at inference time.

