Kubernetes & Cloud Native Trends, with Alain Regnier and Camila Martins

Kubernetes Podcast from Google

Advancements in Serving Large Language Models on Kubernetes

This chapter introduces llm-d, an open source project designed to improve the deployment of large language models on Kubernetes. The discussion covers integrating vLLM with the Kubernetes Inference Gateway, with a focus on performance gains and simpler deployment.
