FFP EP. 9 | China’s AI Breakthrough, Time Crystals, Hidden Viruses, & Brightest Cosmic Signal

From First Principles

DeepSeek's Training Tricks: Reinforcement Learning

Krishna explains DeepSeek's use of pure reinforcement learning to let the model autonomously develop reasoning strategies.

Play episode from 18:36
