Optimizing Reasoning Model Training

This chapter covers how a reasoning model is trained on a refined dataset, with emphasis on selecting and filtering high-quality questions. The speakers discuss the evolving Gemini thinking model and highlight why incorrect answers still matter for building better reasoning skills. They also introduce budget forcing and token-based guidance, two techniques for improving model performance at inference time.

Play episode from 09:19
