

The Startup Powering The Data Behind AGI
Sep 16, 2025
Edwin Chen, CEO of Surge AI and a veteran ML engineer from Google, Twitter, and Facebook, dives into the challenges of quality data labeling in AI. He discusses Surge's innovative approach that focuses on high-skill data collection over traditional methods. Chen critiques the notion of inter-annotator agreement, especially in complex tasks like poetry and math. The conversation also touches on the role of high-quality data in advancing artificial general intelligence and the necessary evolution of data practices for future AI breakthroughs.
AI Snips
Early Craigslist Labeling Failure
- Edwin described a labeling project at Twitter where two Craigslist hires produced junk labels, forcing him to do the labeling himself.
- That experience motivated him to build a better data platform rather than trust ad-hoc spreadsheets and slow workers.
Quality Over Commodity Labeling
- Commodity labeling fits only low-skill tasks like bounding boxes where human variance is minimal.
- High-complexity language and behavior tasks require a platform built for quality, not just scale.
Built From A Trusted Network
- Edwin recruited an initial crowd from his existing network of people who've cared about data labeling for years.
- Early customers were engineers and research scientists at companies like Airbnb and Twitter who demanded higher-quality data.