Episode 40: DeepSeek facts vs hype, model distillation, and open source competition

Mixture of Experts

The Resurgence of Reinforcement Learning in Model Training

This chapter examines how reinforcement learning is used in the DeepSeek project, highlighting two distinct training methodologies. It then considers how these approaches affect model training and whether RL can improve reasoning in smaller models.
