
Kubernetes & Cloud Native Trends, with Alain Regnier and Camila Martins
Kubernetes Podcast from Google
00:00
Advancements in Serving Large Language Models on Kubernetes
This chapter introduces llm-d, an open source project designed to improve the deployment of large language models on Kubernetes. It discusses integrating vLLM with the Kubernetes Inference Gateway, focusing on performance gains and simpler deployment.