
BlueDot Narrated Scaling: The State of Play in AI
Sep 9, 2025
Explore AI scaling laws and how bigger models, more data, and more compute lead to remarkable advances. Discover the trade-off between general-purpose models and specialized, domain-trained ones, illustrated by BloombergGPT and GPT-4. Learn about the rising cost of frontier training and how AI models have been classified across generations. Delve into the distinctive features of leading models like Claude, Gemini 1.5 Pro, and Grok 2, along with the introduction of a new inference-time 'thinking' scaling law.
AI Snips
Scale Drives Capability
- Larger models, trained on more data, tend to be more capable across a wide range of tasks.
- Scale correlates with lower loss, higher benchmark scores, and more general, flexible behavior; see the sketch below for the standard formalization.
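
To make "scale correlates with capability" concrete, here is a minimal Python sketch of a Chinchilla-style scaling law, using the fitted constants reported by Hoffmann et al. (2022). The episode doesn't cite this formula, so treat it as illustrative background rather than the show's own math.

```python
# Chinchilla-style scaling law: predicted pretraining loss falls as a
# power law in parameters N and training tokens D (Hoffmann et al., 2022).
# Constants are the published fits from that paper, not from this episode.

def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    """Predicted loss L(N, D) = E + A / N^alpha + B / D^beta."""
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28
    return E + A / n_params**alpha + B / n_tokens**beta

# Scaling both model and data by ~10x lowers the predicted loss:
print(chinchilla_loss(7e9, 140e9))    # ~7B params, ~140B tokens  -> ~2.18
print(chinchilla_loss(70e9, 1.4e12))  # ~70B params, ~1.4T tokens -> ~1.94
```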
Generational Jumps Cost Orders Of Magnitude
- Pushing the frontier requires roughly tenfold increases in data and compute for each major capability jump.
- Each generation therefore costs an order of magnitude more compute and money to train; the toy projection below illustrates the compounding.
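
As a toy illustration of how fast a flat 10x-per-generation rule compounds, the snippet below projects compute and training cost forward; the base figures are hypothetical round numbers, not from the episode.

```python
# Toy projection of the "order of magnitude per generation" claim.
# Base figures below are hypothetical round numbers for illustration only.

base_flops = 1e24  # assumed compute for generation 0 (hypothetical)
base_cost = 50e6   # assumed training cost in USD (hypothetical)

for gen in range(4):
    factor = 10 ** gen  # 10x more compute and money each generation
    print(f"gen {gen}: ~{base_flops * factor:.1e} FLOPs, "
          f"~${base_cost * factor / 1e6:,.0f}M")
```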
Bloomberg GPT vs GPT-4 Example
- Bloomberg trained BloombergGPT, a finance-focused model, with roughly 200 zettaFLOPs of compute, yet the general-purpose GPT-4 still outperformed it.
- The lesson: domain-specific data helps, but sheer model scale often wins (see the back-of-the-envelope check below).
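
The ~200 zettaFLOPs figure is roughly consistent with the standard C ≈ 6·N·D estimate for dense-transformer training compute, given BloombergGPT's published configuration of about 50B parameters and roughly 570B training tokens. A quick back-of-the-envelope check:

```python
# Back-of-the-envelope check of the ~200 zettaFLOPs figure using the
# standard C ≈ 6 * N * D estimate for dense-transformer training compute.
# Parameter and token counts are approximations from the BloombergGPT paper.

n_params = 50e9   # ~50B parameters
n_tokens = 570e9  # ~570B training tokens

flops = 6 * n_params * n_tokens
print(f"~{flops:.1e} FLOPs (~{flops / 1e21:.0f} zettaFLOPs)")
# ~1.7e+23 FLOPs (~171 zettaFLOPs) -- the same order as the ~200 quoted
```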



