BlueDot Narrated

Scaling: The State of Play in AI

Sep 9, 2025
Explore AI scaling laws and how bigger models trained with more data and compute lead to remarkable advancements. Discover the difference between general-purpose models and domain-specialized ones, illustrated by examples like BloombergGPT and GPT-4. Learn about the rising costs of frontier training and how AI models have been classified over the years. Delve into the distinctive features of leading models like Claude, Gemini 1.5 Pro, and Grok 2, along with the introduction of a new inference-time "thinking" scaling law.
INSIGHT

Scale Drives Capability

  • Larger models with more parameters and training data tend to be more capable across many tasks.
  • Scale correlates with better performance, benchmark scores, and general intelligence-like behavior.
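The scale–capability relationship described above is often formalized as a power-law scaling curve. A minimal sketch, using the Chinchilla-style parametric loss fit (coefficients are the published Hoffmann et al. 2022 values, shown here purely for illustration):

```python
# Chinchilla-style scaling law: predicted pretraining loss falls as a
# power law in parameter count N and training tokens D.
# Coefficients are the Hoffmann et al. (2022) fits; illustrative only.
E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

def predicted_loss(n_params: float, n_tokens: float) -> float:
    """Predicted loss for a model with n_params parameters
    trained on n_tokens tokens."""
    return E + A / n_params**alpha + B / n_tokens**beta

# Scaling both parameters and data 10x lowers the predicted loss:
small = predicted_loss(1e9, 2e10)    # ~1B params, ~20B tokens
large = predicted_loss(1e10, 2e11)   # ~10B params, ~200B tokens
print(large < small)  # True: bigger model + more data -> lower loss
```

The diminishing-returns shape of this curve is why each capability jump demands the order-of-magnitude increases discussed next.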
INSIGHT

Generational Jumps Cost Orders Of Magnitude

  • Improving frontier models requires roughly tenfold increases in data and compute to gain major capability jumps.
  • Each generation costs an order of magnitude more compute and money to train.
ANECDOTE

Bloomberg GPT vs GPT-4 Example

  • Bloomberg trained BloombergGPT with roughly 200 zettaFLOPs of compute, yet GPT-4 still outperformed it.
  • The lesson: domain data helps, but sheer model scale often wins.