(Voiceover) Tülu 3: The next era in open post-training

Nov 21, 2024

Dive into the evolution of open post-training for language models. Discover how techniques like direct preference optimization have reshaped the landscape since ChatGPT. The conversation covers methodologies such as scaling prompts and the role of reinforcement learning with verifiable rewards. Get a preview of future developments aimed at improving open-weight models, and see how competition is pushing the boundaries of what AI can achieve.

Chapters

Advancements in Open Post-Training for Language Models

00:00 • 6min

Innovations in Model Post-Training Techniques

06:21 • 1min