Why the CNCF's New Executive Director is Obsessed With Inference

The New Stack Podcast

Kubernetes Primitives Enable Inference Gains

Jonathan describes using Kubernetes networking and routing plugins to steer inference requests to GPUs whose caches are already warm, which reduces latency.
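
As a rough illustration of the idea, the sketch below picks a replica for each request by checking which replicas have already served the same prompt prefix, and falls back to the least-loaded replica otherwise. It is a minimal sketch under stated assumptions, not the mechanism described in the episode and not a Kubernetes API: `Replica`, `prefix_key`, `pick_replica`, and the 64-character prefix length are all hypothetical names and values chosen for this example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical model-server replica (for example, one GPU-backed pod)."""
    name: str
    warm_prefixes: set = field(default_factory=set)  # prefix keys seen so far
    in_flight: int = 0                               # requests currently assigned


def prefix_key(prompt: str, prefix_chars: int = 64) -> str:
    """Hash the leading characters of a prompt (assumed proxy for cache reuse)."""
    return hashlib.sha256(prompt[:prefix_chars].encode("utf-8")).hexdigest()


def pick_replica(prompt: str, replicas: list) -> Replica:
    """Prefer a replica that already served this prefix, else the least loaded."""
    key = prefix_key(prompt)
    warm = [r for r in replicas if key in r.warm_prefixes]
    candidates = warm or replicas
    chosen = min(candidates, key=lambda r: r.in_flight)
    chosen.warm_prefixes.add(key)  # this replica's cache is now warm for the prefix
    chosen.in_flight += 1
    return chosen
```

Two requests that share a long system prompt would land on the same replica, while an unrelated prompt goes to whichever replica has the least work. A production router would also need cache eviction, load limits on warm replicas, and health checks, none of which are modeled here.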

Play episode from 07:10