
589: The London Air
Upgrade
Explaining Model Scale and Mixture of Experts
The hosts use a Federico analogy to clarify parameter scale, mixture-of-experts routing, and model memory implications.
Play episode from 56:05
Transcript
