
Inside The $2.2B AI Research Accelerator | Turing
Sourcery
AI has eaten the internet, data labeling is so over, and $30 trillion of human work is on the verge of automation. Jonathan Siddharth, Founder & CEO of Turing, joins Sourcery to break down the power shift in AI training, from commodity data labeling to expert research, and how that shift sets Turing apart from AI data providers like Scale AI, Mercor, & Surge.
Turing has become a hidden force in the AI race, hitting $300M in ARR in 2024 (~3x YoY), achieving profitability, and raising $111M at a $2.2B valuation in March. That growth cements its position as one of the fastest-growing AGI infrastructure companies.
Today, frontier labs like OpenAI, Anthropic, Meta, Google, Microsoft, Nvidia, & Amazon rely on Turing for the frontier data that pushes AI forward across the four pillars of superintelligence:
• Multimodality
• Reasoning
• Tool use
• Coding
We explore Turing’s expansion into the enterprise, closing the “gap” – where Fortune 500s in finance, insurance, and pharma are racing to build proprietary intelligence on their own data, creating durable moats in the $30T knowledge work economy.
P.S. Jonathan also explains how labs like OpenAI train models:
• Pre-training on filtered internet corpora (Common Crawl, GitHub, books, video)
• Post-training with supervised fine-tuning (human Q&A datasets)
• Reinforcement learning (RLHF + verifiable domains) to align models with human preferences
• Model-breaking data from Turing’s 4M+ engineers to close gaps and advance systems like GPT-5
1. Jonathan Siddharth: https://www.linkedin.com/in/jonsid/
2. Molly O’Shea: https://x.com/MollySOShea
3. Sourcery: https://x.com/sourceryvc
Brought to you by:
• Brex—The modern finance platform, combining the world’s smartest corporate card with integrated expense management, banking, bill pay, & travel.
As a Sourcery listener you get: 75,000 points after spending $3,000 on Brex card(s), white-glove onboarding, $5,000 in AWS credits, $2,500 in OpenAI credits, & access to $180k+ in SaaS discounts. That's on top of $500 toward Brex travel, $300 in cashback, & exclusive perks (like billboards). Visit: https://brex.com/sourcery
• Turing—Turing delivers top-tier talent, data, and tools to help AI labs improve model performance—and enables enterprises to turn those models into powerful, production-ready systems. Visit: https://turing.com/sourcery
• Carta—Carta connects founders, investors, and limited partners through software purpose-built for private capital. Trusted by 65,000+ companies in 160+ countries, Carta’s platform of software & services lays the groundwork so you can build, invest, and scale with confidence. Visit: https://carta.com/sourcery
• Kalshi—The largest prediction market and the only legal platform in the US where people can trade directly on the outcomes of future events: https://kalshi.com/sourcery
Follow Sourcery for the latest updates!
(00:00) AI Ate The Internet
(00:49) Training superintelligence: the race to AGI
(02:31) Viral tweet
(03:24) What Turing actually does
(04:43) The internet data is “used up” — where will new data come from?
(05:34) Four pillars of superintelligence: multimodality, reasoning, tool use, coding
(06:07) Automating $30T of global knowledge work
(09:18) The $1B revenue opportunity
(10:59) Why Turing is a research-first accelerator, not a data labeler
(13:45) Jonathan’s Stanford AI Lab roots and founding DNA
(17:57) How models are built: pre-training vs. post-training
(20:14) RLHF, reinforcement learning, and “breaking the models”
(25:19) GPT-5 and the myth of rapid takeoff
(30:46) Safety debates and human-in-the-loop systems
(34:53) Closing the Enterprise Gap: finance, insurance, & pharma
(39:23) Why proprietary enterprise data is the next moat in AI