

AI Engineering Podcast
Tobias Macey
This show is your guidebook to building scalable and maintainable AI systems. You will learn how to architect AI applications, apply AI to your work, and the considerations involved in building or customizing new models. Everything that you need to know to deliver real impact and value with machine learning and artificial intelligence.
Episodes

40 snips
Oct 19, 2025 • 1h 6min
Specs, Tests, and Self‑Verification: The Playbook for Agentic Engineering Teams
 Andrew Filev, CEO and founder of ZenCoder, shares his expertise on architecting AI-first engineering workflows. He discusses the evolution from simple autocomplete to truly agentic models and emphasizes the importance of context engineering and verification. Filev details ZenCoder's internal playbook, covering human-in-the-loop strategies and test-driven development. He also explores the balance between human control and model autonomy, predicts self-verification trends, and gives insightful lessons on navigating the challenges of building modern coding systems. 

43 snips
Oct 11, 2025 • 1h 12min
From Probabilistic to Trustworthy: Building Orion, an Agentic Analytics Platform
 In a fascinating discussion, Lucas Thelosen, CEO of Gravity with experience from Looker and Google, and Drew Gillson, AI expert and co-founder of Gravity, dive into their innovative analytics platform, Orion. They explore the shift from probabilistic to deterministic tools for data accuracy and the importance of user-oriented push-based insights. The duo emphasizes context engineering, organizational impact, and the emerging role of 'AI managers' to drive better data literacy. They also share surprising applications of Orion for qualitative analysis at scale. 

18 snips
Oct 7, 2025 • 51min
Building Production-Ready AI Agents with Pydantic AI
 Samuel Colvin, the mastermind behind the Pydantic validation library, shares his journey in creating Pydantic AI—a type-safe framework for AI agents in Python. He discusses the importance of stability and observability, comparing single-agent versus multi-agent systems. Samuel explores architectural patterns, emphasizing minimal abstractions and robust engineering practices. He also addresses code safety and the challenge of model-provider churn, while promoting open standards for enhanced observability. Join him as he reveals insights on crafting reliable AI agents! 

14 snips
Sep 28, 2025 • 55min
From GPUs to Workloads: Flex AI’s Blueprint for Fast, Cost‑Efficient AI
 Brijesh Tripathi, CEO of Flex AI and a former architect at Intel, NVIDIA, Apple, and Tesla, discusses transforming AI workflows by implementing 'workload as a service'. He highlights the importance of minimizing DevOps burdens to enhance productivity, revealing how inconsistent Kubernetes layers create challenges for AI teams. Brijesh elaborates on optimizing training and inference processes and emphasizes Flex AI's focus on easing the complexity of heterogeneous compute while ensuring cost efficiency. His vision aims to empower teams, enabling them to innovate without infrastructure hassles. 

40 snips
Sep 20, 2025 • 51min
Right-Sizing AI: Small Language Models for Real-World Production
 In this discussion, Steven Huels, VP of AI Engineering at Red Hat, unpacks the power of small language models (SLMs) for real-world applications. He highlights the advantages of SLMs in fitting onto single enterprise GPUs and their operational capabilities. The conversation dives into self-hosting models versus relying on APIs, tackles organizational readiness, and discusses innovations in agentic systems. Steven shares real-world examples like scam detection and emphasizes the importance of customization, automated evaluation, and continuous retraining for efficient AI deployment. 

43 snips
Sep 13, 2025 • 54min
AI Agents and Identity Management
 Julianna Lamb, co-founder and CTO of Stytch, delves into identity management in AI, discussing its complexities amidst evolving technologies. She highlights the challenges of permissions and security as AI agents take on human tasks. The conversation covers innovative authentication strategies, including the need for layered verification and adapting systems with robust security. Julianna emphasizes experimenting with AI agents and suggests the importance of feedback mechanisms for seamless integration and optimal performance, all while navigating the future of identity standards. 

16 snips
Sep 4, 2025 • 51min
Revolutionizing Production Systems: The Resolve AI Approach
 In this engaging conversation, Spiros Xanthos, CEO of Resolve AI, shares his vision for revolutionizing operational systems with AI agents. He discusses the limitations of traditional tools and how intelligent agents can enhance troubleshooting. Spiros highlights the importance of context and memory for effective AI integration, as well as the evolving collaboration between humans and AI in production environments. He emphasizes the need for continuous learning to maximize AI's potential, paving the way for more efficient human-machine partnerships and improved user experiences. 

67 snips
Aug 26, 2025 • 1h 14min
Designing Scalable AI Systems with FastMCP: Challenges and Innovations
 Jeremiah Lowin, the founder and CEO of Prefect Technologies, discusses the FastMCP framework, designed to streamline AI tool deployment. He shares insights on the evolution of FastMCP and its role in simplifying the design of MCP servers. The conversation delves into the importance of context engineering and the challenges of authentication in AI systems. Jeremiah emphasizes the need for simplicity in development and addresses potential limitations of FastMCP, while exploring future innovations in the AI landscape. 

14 snips
Aug 23, 2025 • 41min
Proactive Monitoring in Heavy Industry: The Role of AI and Human Curiosity
 Tara Javidi, CTO of KavAI and a researcher specializing in AI and information theory, shares her insights on proactive monitoring in heavy industry. She discusses how her platform harnesses generative AI to mimic human curiosity, improving data collection and predictive analytics. The conversation highlights the integration of AI into existing workflows, building trust with operators, and the potential of AI to prevent environmental catastrophes. Javidi emphasizes the importance of curiosity-driven architectures and their impact on operational efficiency. 

Aug 7, 2025 • 52min
Navigating the AI Landscape: Challenges and Innovations in Retail
Summary
In this episode of the AI Engineering Podcast, machine learning engineer Shashank Kapadia explores the transformative role of generative AI in retail. Shashank shares his journey from an engineering background to becoming a key player in ML, highlighting the excitement of understanding human behavior at scale through AI. He discusses the challenges and opportunities presented by generative AI in retail, where it complements traditional ML by enhancing explainability and personalization, predicting consumer needs, and driving autonomous shopping agents and emotional commerce. Shashank elaborates on the architectural and operational shifts required to integrate generative AI into existing systems, emphasizing orchestration, safety nets, and continuous learning loops, while also addressing the balance between building and buying AI solutions, considering factors like data privacy and customization.

Announcements
Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems.
Your host is Tobias Macey and today I'm interviewing Shashank Kapadia about applications of generative AI in retail.

Interview
Introduction
How did you get involved in machine learning?
Can you summarize the main applications of generative AI that you are seeing the most benefit from in retail/ecommerce?
What are the major architectural patterns that you are deploying for generative AI workloads?
Working at an organization like Walmart, you already had a substantial investment in ML/MLOps. What are the elements of that organizational capability that remain the same, and what are the catalyzed changes as a result of generative models?
When working at the scale of Walmart, what are the different types of bottlenecks that you encounter which can be ignored at smaller orders of magnitude?
Generative AI introduces new risks around brand reputation, accuracy, trustworthiness, etc. What are the architectural components that you find most effective in managing and monitoring the interactions that you provide to your customers?
Can you describe the architecture of the technical systems that you have built to enable the organization to take advantage of generative models?
What are the human elements that you rely on to ensure the safety of your AI products?
What are the most interesting, innovative, or unexpected ways that you have seen generative AI break at scale?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on AI?
When is generative AI the wrong choice?
What are you paying special attention to over the next 6-36 months in AI?

Contact Info
LinkedIn

Parting Question
From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Closing Announcements
Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers.

Links
Walmart Labs

The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0


