

(Voiceover) Tülu 3: The next era in open post-training
Nov 21, 2024
Dive into the fascinating evolution of open post-training for language models! Discover how techniques like direct preference optimization are reshaping the landscape post-chatGPT. The conversation unveils innovative methodologies such as scaling prompts and the role of reinforcement learning with verifiable rewards. Get a sneak peek into future developments aimed at enhancing open weight models, and see how this competitive drive is pushing the boundaries of what AI can achieve!
Chapters
Transcript
Episode notes