

The Data Factory: Inside the $100B Race for Post-Training Supremacy, with Labelbox CEO Manu Sharma
Jul 8, 2025
Manu Sharma, founder and CEO of Labelbox, a company that supplies training data to AI labs, discusses how AI training methods have evolved from simple labeling to sophisticated reinforcement learning environments. He explains how heavily AI labs are investing in training data and walks through the nuances of post-training strategies. He also shares insights on the competitive AI landscape, the importance of human data, and the interplay of creativity and AI in modern industries.
AI Snips
Billions Spent on Specialized AI Data
- Frontier AI labs now spend over a billion dollars annually on specialized training data for advanced tasks.
- The shift from supervised to reinforcement learning reflects the growing complexity and specialization of AI training data.
Post-training Emphasizes Reinforcement Learning
- Post-training budgets increasingly focus on reinforcement learning for skill-specific tasks like coding and math.
- Models are trained and evaluated against verifiable rewards (outcomes that can be checked automatically), which enables faster improvement in reasoning and coding.
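As a rough illustration of what a "verifiable reward" can look like, the sketch below scores a math answer against a known reference and scores a candidate function by the fraction of test cases it passes. The function names `math_reward` and `code_reward` are hypothetical and are not taken from the episode or from any lab's pipeline; they only show the general idea that the reward is computed by a program rather than by a human judgment.

```python
import math


def math_reward(model_answer: str, reference: float, tol: float = 1e-6) -> float:
    """Hypothetical reward: 1.0 if the answer parses to the reference value, else 0.0."""
    try:
        value = float(model_answer.strip())
    except ValueError:
        # An answer that cannot be parsed as a number earns no reward.
        return 0.0
    return 1.0 if math.isclose(value, reference, abs_tol=tol) else 0.0


def code_reward(candidate, test_cases) -> float:
    """Hypothetical reward: fraction of (args, expected) test cases the candidate passes."""
    if not test_cases:
        return 0.0
    passed = 0
    for args, expected in test_cases:
        try:
            if candidate(*args) == expected:
                passed += 1
        except Exception:
            # A crashing candidate simply fails that test case.
            pass
    return passed / len(test_cases)
```

Because both checks run automatically, they can be applied to large numbers of model outputs without a human in the loop, which is one plausible reading of why this setup speeds up improvement on math and coding tasks.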
Human Data Anchors AI Alignment
- Human expert data anchors AI alignment by providing quality judgments where right answers aren't known.
- Reinforcement learning setups increasingly use graders and rubrics instead of step-by-step human reasoning traces.
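To make the idea of grading with a rubric more concrete, here is a minimal sketch that assumes a rubric is a list of weighted criteria, each with a pass/fail check. The `Criterion` class, the `rubric_score` function, and the example criteria are all hypothetical; in practice the checks would more likely come from expert judgments or a grader model than from the simple string tests used here as stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Criterion:
    """One hypothetical rubric item: a name, a weight, and a pass/fail check."""
    name: str
    weight: float
    check: Callable[[str], bool]


def rubric_score(response: str, rubric: List[Criterion]) -> float:
    """Return the weighted fraction of rubric criteria that the response satisfies."""
    total = sum(c.weight for c in rubric)
    if total == 0:
        return 0.0
    earned = sum(c.weight for c in rubric if c.check(response))
    return earned / total


# Stand-in criteria for illustration only; real rubrics would be written by experts.
example_rubric = [
    Criterion("states a final answer", 2.0, lambda r: "answer:" in r.lower()),
    Criterion("stays under 50 words", 1.0, lambda r: len(r.split()) <= 50),
]
```

The score depends only on the final response, not on how the model got there, which matches the snip's point that graders and rubrics can replace step-by-step human reasoning traces.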