

Deploy and fine-tune LLM models on Kubernetes using KAITO
Aug 7, 2024
Sachi Desai, a Product Manager specializing in AI technologies, and Paul Yu, a Senior Cloud Advocate at Microsoft, dive into the KAITO project for deploying open-source LLMs on Kubernetes. They discuss how KAITO simplifies running AI applications alongside LLMs and enables users to bring and fine-tune their own models. The conversation highlights techniques like LoRA and QLoRA for efficient model training. They also emphasize the role of community engagement in improving AI model deployment and shaping future capabilities.
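For context on what deploying a model with KAITO looks like in practice, here is a minimal sketch of a KAITO Workspace custom resource, based on the pattern used in the project's examples (the model preset name and GPU instance type shown are illustrative assumptions, not taken from the episode):

```yaml
# Hypothetical KAITO Workspace manifest: KAITO provisions GPU nodes
# matching the requested instance type and serves the chosen model preset.
apiVersion: kaito.sh/v1alpha1
kind: Workspace
metadata:
  name: workspace-falcon-7b        # illustrative name
resource:
  instanceType: "Standard_NC12s_v3"  # example Azure GPU SKU; pick one sized for your model
  labelSelector:
    matchLabels:
      apps: falcon-7b
inference:
  preset:
    name: "falcon-7b"             # example preset; see the KAITO repo for supported models
```

Applying a manifest like this (e.g. with `kubectl apply -f workspace.yaml`) lets KAITO handle GPU node provisioning and model serving, which is the simplification discussed in the episode.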
Chapters
Transcript
Episode notes
Intro
00:00 • 2min
Kubernetes and AI: Integrating Kaito for Enhanced Workloads
01:56 • 23min
Optimizing LLM Fine-Tuning with Kaito
24:59 • 10min
Engaging Community Feedback for Enhanced AI Model Deployment
35:02 • 2min
Enhancing AI with RAG and Community Contributions
37:05 • 4min
Understanding KAITO and Fine-Tuning Language Models on AKS
41:17 • 3min