
Is ChatGPT an N-gram model on steroids?

Machine Learning Street Talk (MLST)


Exploring Transformer Training Dynamics and Neural Network Representations

This chapter delves into the optimizer used for Chinchilla, AdamW (Adam with weight decay), and its implications for transformer training dynamics and template matching. It also discusses the representation and generalization capabilities of neural networks, likening them to hash tables that learn from exemplars.
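
The "hash table that learns from exemplars" analogy (echoing the n-gram framing in the episode title) can be made concrete with a toy sketch. This is purely illustrative; the class name, data, and code below are assumptions for exposition, not anything stated in the episode.

```python
from collections import Counter, defaultdict

class NGramLookup:
    """Toy 'hash table learned from exemplars': an n-gram next-token table.

    Training just records, for every length-n context seen in the data,
    how often each next token followed it; prediction is a pure lookup.
    """

    def __init__(self, n: int = 2):
        self.n = n
        self.table = defaultdict(Counter)  # context tuple -> next-token counts

    def fit(self, tokens):
        for i in range(len(tokens) - self.n):
            context = tuple(tokens[i:i + self.n])
            self.table[context][tokens[i + self.n]] += 1

    def predict(self, context):
        counts = self.table.get(tuple(context[-self.n:]))
        if not counts:
            return None  # unseen context: a pure lookup table has no way to generalize
        return counts.most_common(1)[0][0]

# Usage: memorize the exemplars, then predict by lookup.
tokens = "the cat sat on the mat the cat sat by the door".split()
model = NGramLookup(n=2)
model.fit(tokens)
print(model.predict(["the", "cat"]))  # -> 'sat'
```

The sketch is only a foil: a pure lookup table returns nothing for an unseen context, which is where the chapter's discussion of generalization comes in.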
