The CEO Behind the Fastest-Growing AI Inference Company | Tuhin Srivastava

Gradient Dissent: Conversations on AI


How modern LLM inference works and how it's optimized

Tuhin breaks inference down into infrastructure problems and runtime problems, and covers key metrics such as time-to-first-token and throughput.

Chapter starts at 29:50.
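
To make those two metrics concrete, here is a minimal sketch (not from the episode) of how one might measure time-to-first-token and streaming throughput against an OpenAI-compatible inference endpoint; the model name, prompt, and endpoint are placeholder assumptions.

```python
# Minimal sketch: measure time-to-first-token (TTFT) and streaming throughput
# against an OpenAI-compatible endpoint. Model name and prompt are placeholders.
import time

from openai import OpenAI

# Reads OPENAI_API_KEY from the environment; point base_url at any
# OpenAI-compatible inference server instead, if desired.
client = OpenAI()

start = time.perf_counter()
first_token_at = None
chunk_count = 0

stream = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model
    messages=[{"role": "user", "content": "Explain LLM inference briefly."}],
    stream=True,
)

for chunk in stream:
    if not chunk.choices:
        continue  # some servers send trailing chunks with no choices
    delta = chunk.choices[0].delta.content
    if not delta:
        continue
    if first_token_at is None:
        # TTFT: latency from request send to the first streamed token.
        first_token_at = time.perf_counter()
    chunk_count += 1

end = time.perf_counter()
ttft = (first_token_at - start) if first_token_at else float("nan")
# Chunks per second after the first token, as a rough proxy for tokens/sec;
# exact token counts would need the server's usage stats or a tokenizer.
throughput = chunk_count / max(end - (first_token_at or end), 1e-9)

print(f"TTFT: {ttft:.3f}s, ~{throughput:.1f} chunks/s after first token")
```

In this framing, TTFT reflects the runtime side (queueing, prefill) while sustained throughput reflects how well the serving stack keeps the hardware busy during decoding.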
