Inferact: Building the Infrastructure That Runs Modern AI

The a16z Show

Why autoregressive models differ

Woosuk contrasts LLM inference with traditional ML workloads and explains why dynamism complicates serving.
