
LLM on K8s Panel // LLMs in Production Conference Part II

MLOps.community


Abstraction of LLMs on Kubernetes, Cost Considerations, and Pros and Cons of Kubernetes for LLM Training vs. Inferencing

This chapter explores running LLMs as an abstraction on top of Kubernetes, comparing the cost of self-hosting LLMs on Kubernetes against calling a hosted model such as the OpenAI API via a cloud worker. The panel also weighs the pros and cons of using Kubernetes for LLM training versus inference, with a focus on optimizing velocity and the challenges of workload management.
