303 - How LLMs Work - the 20 minute explainer

Fragmented - AI Developer Podcast

Inference: Predicting the Next Token

Iury and Kaushik explain inference as a loop: the model generates a probability distribution over the next token, one token is sampled from it (top-k/top-p), and the step repeats.
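
For readers who want to see the idea in code, here is a minimal sketch of that loop in plain Python. It is not taken from the episode: `toy_model`, the four-token vocabulary, and every function name below are hypothetical stand-ins, and a real LLM would produce the scores (logits) with a neural network instead.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into a probability distribution that sums to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_top_p(probs, k=None, p=None):
    """Keep the k most likely tokens, then the smallest set whose
    cumulative probability reaches p, and renormalize what is left."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if k is not None:
        ranked = ranked[:k]
    if p is not None:
        kept, mass = [], 0.0
        for i in ranked:
            kept.append(i)
            mass += probs[i]
            if mass >= p:
                break
        ranked = kept
    total = sum(probs[i] for i in ranked)
    return {i: probs[i] / total for i in ranked}


def sample_next(logits, k=None, p=None, rng=random):
    """Sample one token id from the filtered distribution."""
    dist = top_k_top_p(softmax(logits), k, p)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]


def generate(model, prompt_ids, max_new_tokens, eos_id, k=None, p=None, rng=random):
    """The loop: predict, sample, append, repeat until EOS or the token limit."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        token = sample_next(model(ids), k, p, rng)
        ids.append(token)
        if token == eos_id:
            break
    return ids


def toy_model(ids):
    """Hypothetical stand-in for an LLM: strongly favors (last token + 1) mod 4."""
    logits = [0.0] * 4
    logits[(ids[-1] + 1) % 4] = 5.0
    return logits


if __name__ == "__main__":
    # k=1 keeps only the most likely token, so this run is deterministic.
    print(generate(toy_model, [0], max_new_tokens=5, eos_id=3, k=1))
```

Setting `k=1` reduces sampling to always picking the most likely token, while larger `k` or a `p` below 1 keeps some randomness and still cuts off the unlikely tail of the distribution.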

Play episode from 18:28