In this podcast, Manjot Pahwa, Rahul Parundekar, and Patrick Barker discuss the integration of Kubernetes and large language models (LLMs), the challenges Kubernetes poses for data scientists, and considerations for hosting LLM applications in production. They also explore abstractions for running LLMs on Kubernetes, cost considerations, and the pros and cons of using Kubernetes for LLM training versus inference. Additionally, they touch on using Kubernetes for real-time online inference and the availability of abstractions like Metaflow.