The Cloudcast

Building Private GenAI stacks

Jul 23, 2025
Luke Marsden, CEO and Founder of HelixML, explains what Private GenAI is and why enterprises with regulatory compliance requirements need it. He discusses the integration of AI into CI/CD pipelines and breaks down the layers of a Private AI stack. Marsden highlights the advantages of Retrieval Augmented Generation (RAG) over fine-tuning LLMs and explores the shift from traditional DevOps to MLOps. Listen in for insights on managing large language models securely and the importance of personalized AI workflows in regulated industries.
ANECDOTE

Manufacturing Firm Adopts AI Early

  • A US manufacturing company, despite being a traditional industry player, fine-tuned Llama 2 to prepare for the impact of AI on its business.
  • This shows how forward-thinking boards recognize AI's transformative business potential early on.
INSIGHT

Private AI Stack Components

  • Running open source LLMs locally enhances control, privacy, and security, especially in regulated industries.
  • Private AI stacks combine infrastructure, GPU scheduling, control planes, models, knowledge, and API integrations.
ADVICE

Optimize GPU Use on Kubernetes

  • Use Kubernetes with GPU device plugins for managing private AI infrastructure effectively.
  • Consider advanced GPU schedulers for better memory packing and cost efficiency beyond native Kubernetes capabilities.
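As a rough illustration of the two points above, the sketch below first builds a Pod manifest that requests a GPU through the `nvidia.com/gpu` extended resource exposed by the NVIDIA device plugin, then shows a simple first-fit-decreasing packer that places several models on one GPU by memory. The native device-plugin path hands out whole GPUs only, which is the gap a memory-aware scheduler tries to close. The model names, image name, memory sizes, and the packing heuristic are all assumptions made for illustration; the episode does not describe a specific scheduler algorithm.

```python
import json


def gpu_pod_manifest(name, image, gpus=1):
    """Build a Pod manifest requesting whole GPUs via the NVIDIA device plugin."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": name,
                "image": image,  # hypothetical image name supplied by the caller
                # Extended resources such as GPUs are requested as whole integers.
                "resources": {"limits": {"nvidia.com/gpu": gpus}},
            }],
        },
    }


def pack_models(models_gb, gpu_capacity_gb):
    """First-fit decreasing: put each model on the first GPU with enough free memory.

    Illustrative heuristic only, not the algorithm of any particular scheduler.
    """
    gpus = []  # each entry tracks remaining memory and the models placed there
    largest_first = sorted(models_gb.items(), key=lambda kv: kv[1], reverse=True)
    for model, size in largest_first:
        if size > gpu_capacity_gb:
            raise ValueError(f"{model} needs {size} GB, more than one GPU holds")
        for gpu in gpus:
            if gpu["free"] >= size:
                gpu["free"] -= size
                gpu["models"].append(model)
                break
        else:
            # No existing GPU had room, so start a new one.
            gpus.append({"free": gpu_capacity_gb - size, "models": [model]})
    return [gpu["models"] for gpu in gpus]


if __name__ == "__main__":
    # Hypothetical workload: four models with assumed memory footprints in GB.
    workload = {"model-a": 40, "model-b": 30, "model-c": 20, "model-d": 10}
    print(json.dumps(gpu_pod_manifest("llm-server", "example/llm:latest"), indent=2))
    print(pack_models(workload, gpu_capacity_gb=80))
```

With one whole GPU per model, the hypothetical workload above would occupy four GPUs; packing by memory needs fewer, which is where the cost saving comes from.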