KubeFM

Latest episodes

Mar 18, 2025 • 52min

Saving 10s of thousands of dollars deploying AI at scale with Kubernetes, with John McBride

Curious about running AI models on Kubernetes without breaking the bank? This episode delivers practical insights from someone who's done it successfully at scale.

John McBride, VP of Infrastructure and AI Engineering at the Linux Foundation, shares how his team at OpenSauced built StarSearch, an AI feature that uses natural language processing to analyze GitHub contributions and provide insights through semantic queries. By using open-source models instead of commercial APIs, the team saved tens of thousands of dollars.

You will learn:
- How to deploy vLLM on Kubernetes to serve open-source LLMs like Mistral and Llama, including configuration challenges with GPU drivers and DaemonSets
- Why smaller models (7-14B parameters) can achieve 95% effectiveness for many tasks compared to larger commercial models, with proper prompt engineering
- How running inference workloads on your own infrastructure with T4 GPUs can reduce costs from tens of thousands to just a couple thousand dollars monthly
- Practical approaches to monitoring GPU workloads in production, including handling unpredictable failures and VRAM consumption issues

Sponsor
This episode is brought to you by StackGen! Don't let infrastructure block your teams. StackGen deterministically generates secure cloud infrastructure from any input: existing cloud environments, IaC, or application code.

More info
Find all the links and info for this episode here: https://ku.bz/wP6bTlrFs
Interested in sponsoring an episode? Learn more.

Mar 4, 2025 • 31min

I just want mTLS on Kubernetes, with John Howard

Dive into the world of Kubernetes security with this insightful conversation about securing cluster traffic through encryption.

John Howard, Senior Software Engineer at Solo.io, explains the complexities of implementing Mutual TLS (mTLS) in Kubernetes. He discusses the evolution from DIY approaches to Service Mesh solutions, focusing on Istio's Ambient Mesh as a simplified path to workload encryption.

You will learn:
- Why DIY mTLS implementation in Kubernetes is challenging at scale, requiring certificate management, application updates, and careful transition planning
- How Service Mesh solutions offload security concerns from applications, allowing developers to focus on business logic while infrastructure handles encryption
- The advantages of Ambient Mesh's approach to simplifying mTLS implementation with its node proxy and waypoint proxy architecture

Sponsor
This episode is sponsored by Learnk8s: get started on your Kubernetes journey through comprehensive online, in-person or remote training.

More info
Find all the links and info for this episode here: https://ku.bz/sk-ZF1PG9

Feb 25, 2025 • 32min

Learned it the hard way: don't use Cilium's default Pod CIDR, with Isala Piyarisi

This episode examines how a default configuration in Cilium CNI led to silent packet drops in production after 8 months of stable operations.

Isala Piyarisi, Senior Software Engineer at WSO2, shares how his team discovered that Cilium's default Pod CIDR (10.0.0.0/8) was conflicting with their Azure Firewall subnet assignments, causing traffic disruptions in their staging environment.

You will learn:
- How Cilium's default CIDR allocation can create routing conflicts with existing infrastructure
- A methodical process for debugging network issues using packet tracing, routing table analysis, and firewall logs
- The procedure for safely changing Pod CIDR ranges in production clusters

Sponsor
This episode is sponsored by Learnk8s: get started on your Kubernetes journey through comprehensive online, in-person or remote training.

More info
Find all the links and info for this episode here: https://ku.bz/kJjXQlmTw
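
The kind of conflict described in this episode can be illustrated with a short overlap check. This is only a sketch using Python's standard ipaddress module, not the method discussed in the episode. The Pod CIDR 10.0.0.0/8 comes from the description above; the function name find_overlaps and the two existing subnets are hypothetical placeholders, since the listing does not state the actual Azure Firewall ranges.

```python
import ipaddress


def find_overlaps(pod_cidr, existing_subnets):
    """Return the existing subnets whose address range overlaps the Pod CIDR."""
    pod_net = ipaddress.ip_network(pod_cidr)
    return [
        subnet
        for subnet in existing_subnets
        if pod_net.overlaps(ipaddress.ip_network(subnet))
    ]


# Hypothetical subnets for illustration only; substitute your own ranges.
existing = ["10.240.0.0/16", "192.168.10.0/24"]

# 10.0.0.0/8 is the Pod CIDR named in the episode description.
print(find_overlaps("10.0.0.0/8", existing))     # ['10.240.0.0/16']

# A narrower, non-conflicting range reports no overlaps.
print(find_overlaps("100.64.0.0/10", existing))  # []
```

Running a check like this against the subnets already in use is one way to spot a conflict before choosing a Pod CIDR.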
Feb 18, 2025 • 33min

Simplifying Kubernetes deployments with a unified Helm chart, with Calin Florescu

Managing microservices in Kubernetes at scale often leads to inconsistent deployments and maintenance overhead. This episode explores a practical solution that standardizes service deployments while maintaining team autonomy.

Calin Florescu discusses how a unified Helm chart approach can help platform teams support multiple development teams efficiently while maintaining consistent standards across services.

You will learn:
- Why inconsistent Helm chart configurations across teams create maintenance challenges and slow down deployments
- How to implement a unified Helm chart that balances standardization with flexibility through override functions
- How to maintain quality through automated documentation and testing with tools like Helm Docs and Helm unittest

Sponsor
This episode is sponsored by Learnk8s: get started on your Kubernetes journey through comprehensive online, in-person or remote training.

More info
Find all the links and info for this episode here: https://ku.bz/mcPtH5395

Feb 4, 2025 • 22min

5,000 pods/second and 60% utilization with Gödel and Katalyst, with Yue Yin

Learn how ByteDance manages computing resources at scale with custom Kubernetes scheduling solutions that handle millions of pods across thousands of nodes.

Yue Yin, Software Engineer at ByteDance, discusses their open-source Gödel scheduler and Katalyst resource management system. She explains how these tools address the challenges of managing online and offline workloads in large-scale Kubernetes deployments.

You will learn:
- How Gödel's distributed architecture with dispatcher, scheduler, and binder components enables the scheduling of 5,000 pods per second
- Why NUMA-aware scheduling and two-layer architecture are crucial for handling complex workloads at scale
- How Katalyst provides node-level resource insights to enable efficient workload co-location and improve CPU utilization

Sponsor
This episode is sponsored by Learnk8s: get started on your Kubernetes journey through comprehensive online, in-person or remote training.

More info
Find all the links and info for this episode here: https://ku.bz/lMpNng_33

Jan 28, 2025 • 33min

Black box vs white box observability in Kubernetes, with Artem Lajko

Platform Engineer Artem Lajko breaks down observability into three distinct layers and explains how tools like Prometheus, Grafana, and Falco serve different purposes. He also shares practical insights on implementing the right level of monitoring based on team requirements and capabilities.

You will learn:
- How to implement the three-layer model (external, internal, and OS-level) and why each layer serves different stakeholders
- How to choose and scale observability tools using a label-based approach (low, medium, high)
- How to manage observability costs by collecting only relevant metrics and logs

Sponsor
This episode is sponsored by Learnk8s: get started on your Kubernetes journey through comprehensive online, in-person or remote training.

More info
Find all the links and info for this episode here: https://ku.bz/9sGxhmm8s

Jan 21, 2025 • 45min

Topology-aware routing: balancing cost savings and reliability, with William Morgan

In this episode, William Morgan, CEO of Buoyant, explores the complex trade-offs between cost optimization and reliability in Kubernetes networking. The discussion focuses on Topology-aware routing and why its implementation might not be the silver bullet for managing cross-zone traffic costs.

William shares practical insights from real-world implementations and explains why understanding these trade-offs is crucial for platform teams managing multi-zone Kubernetes clusters.

You will learn:
- How Topology-aware routing attempts to reduce cross-zone traffic costs but can compromise reliability by limiting inter-zone communication
- Why Layer 7 load balancing offers better traffic management through protocol awareness compared to topology-aware routing's Layer 4 approach
- How HAZL (High Availability Zonal Load Balancing) provides a more nuanced solution by balancing cost savings with reliability guarantees through intelligent traffic routing

Sponsor
This episode is sponsored by Learnk8s: get started on your Kubernetes journey through comprehensive online, in-person or remote training.

More info
Find all the links and info for this episode here: https://ku.bz/CBwn51pl-

Jan 14, 2025 • 50min

Which Kubernetes PostgreSQL operator should you choose?, with David Pech

Are you running PostgreSQL on Kubernetes and need to choose the right operator? In this episode, David Pech, Staff Cloud Ops Engineer, shares his experience implementing database platforms on Kubernetes and guides teams through operator selection and platform requirements.

You will learn:
- The core requirements for a PostgreSQL platform on Kubernetes, including autopilot capabilities, security practices, and observability
- How to evaluate PostgreSQL operators based on their architecture, from single-instance deployments to cloud-native implementations
- What teams should consider before building their own database-as-a-service and common pitfalls to avoid
- The distinction between being production-ready (running single instances) versus platform-ready (operating at scale with proper tooling)

Sponsor
This episode is sponsored by Learnk8s: get started on your Kubernetes journey through comprehensive online, in-person or remote training.

More info
Find all the links and info for this episode here: https://ku.bz/rGMF2ktdb

Dec 10, 2024 • 46min

Exploring multi-tenancy for my Kubernetes learning platform, with Stefan Roman

Stefan Roman shares his experience building Labs4Grabs, a platform that gives students root access to Kubernetes clusters. He discusses the journey from evaluating simple namespace-based isolation to implementing full VM-based isolation with KubeVirt.

You will learn:
- Why namespace isolation isn't sufficient for untrusted users and the limitations of tools like vCluster when running privileged workloads.
- How to use KubeVirt to achieve complete workload isolation and the trade-offs.
- Practical approaches to implementing network security with NetworkPolicies and managing resource allocation across multiple student environments.

Follow Stefan's journey from simple to complex isolation strategies, focusing on the technical decisions and trade-offs he encountered.

Sponsor
This episode is sponsored by Kusari: gain complete visibility into your software components and secure your supply chain through comprehensive tracking and analysis.

More info
Find all the links and info for this episode here: https://ku.bz/Xz-TrmX2F

Dec 3, 2024 • 47min

Optimize the Kubernetes dev experience by creating silos, with Michael Levan

Michael Levan explains how specialized teams and smart abstractions can lead to better outcomes. Drawing from cognitive science and his experience in platform engineering, Michael presents practical strategies for building effective engineering organizations.

You will learn:
- Why specialized teams (or "silos") can improve productivity and why the real enemy is ego, not specialization.
- How to use Internal Developer Platforms (IDPs) and abstractions to empower teams without requiring everyone to be a Kubernetes expert.
- How to balance specialization and collaboration using platform engineering practices and smart abstractions.
- Practical strategies for managing cognitive load in engineering teams and why not everyone needs to know YAML.

Sponsor
This episode is brought to you by Testkube: scale all of your tests with Kubernetes, integrate seamlessly with CI/CD and centralize test troubleshooting and reporting.

More info
Find all the links and info for this episode here: https://ku.bz/qlZPfM-zr
