Working Group Serving, with Yuan Tang and Eduardo Arango

Kubernetes Podcast from Google

CHAPTER

Empowering AI Model Serving in Kubernetes

This chapter introduces a new working group dedicated to improving AI model serving in the Kubernetes ecosystem, which emerged from discussions at KubeCon Europe. The speakers discuss challenges such as startup times and the limitations of Kubernetes APIs, and describe the group's mission to optimize AI inference workloads while collaborating across the community. They also cover the complexities introduced by generative AI and explore potential solutions, such as dynamic resource allocation, to improve the management of multi-GPU and multi-node workloads.

