DeepSeek's Training Tricks: Reinforcement Learning

Krishna explains DeepSeek's use of pure reinforcement learning to let the model autonomously develop reasoning strategies.

Play episode from 18:36
