
FFP EP. 9 | China’s AI Breakthrough, Time Crystals, Hidden Viruses, & Brightest Cosmic Signal
From First Principles
DeepSeek's Training Tricks: Reinforcement Learning
Krishna explains DeepSeek's use of pure reinforcement learning to let the model autonomously develop reasoning strategies.
Segment begins at 18:36