

Running Generative AI Models In Production
Oct 28, 2024
Philip Kiely, an AI infrastructure expert at Baseten, dives into the complexities of running generative AI models in production. He shares insights on selecting the right model based on product requirements and discusses key deployment strategies, including architecture choices and performance monitoring. He also examines the challenges of model quantization and the trade-offs between open-source and proprietary models, and highlights future trends such as local inference, emphasizing the need for compliance in sectors like healthcare.
Chapters
Intro • 00:00 • 2min
Navigating Open Models in Generative AI • 01:41 • 14min
Deploying Generative AI: Architecture and Insights • 15:19 • 27min
Innovative AI Model Deployment • 41:56 • 7min
Choosing the Right Infrastructure for AI Production • 48:38 • 2min
Navigating AI's Future: Trends and Compliance • 50:44 • 4min
Navigating the Challenges of AI Model Evaluation • 54:58 • 3min