The age of ubiquitous AI agents is here, bringing immense potential and unprecedented risk.
Hosts Conor Bronsdon and Vikram Chatterji open the episode by discussing the urgent need to build trust and reliability into next-generation AI agents. Vikram unveils Galileo's free AI reliability platform for agents, featuring Luna 2 small language models (SLMs) for real-time guardrails and the Insights Engine for automatic failure mode analysis. The platform enables cost-effective, low-latency evaluations in production and changes how teams debug agents. Achieving trustworthy AI agents demands rigorous testing, continuous feedback, and robust guardrailing: complex challenges that call for powerful solutions from partners like Elastic.
Conor welcomes Philipp Krenn, Director of Developer Relations at Elastic, to discuss the two companies' collaboration on AI agent reliability, including how Elastic uses Galileo's platform for evaluation. Philipp details Elastic's evolution from a search powerhouse to a key AI enabler, transforming data access with Retrieval-Augmented Generation (RAG) and new interaction modes. He discusses Elastic's investment in SLMs for efficient re-ranking and embeddings, and emphasizes robust evaluation and observability in production. The collaboration aims to equip developers to build reliable, high-performing AI systems for every enterprise.
Chapters:
00:00 Introduction
01:09 Galileo's AI Reliability Platform
01:43 Challenges in AI Agent Reliability
06:17 Insights Engine and Its Importance
11:00 Luna 2: Small Language Models
14:42 Custom Metrics and Agent Leaderboard
19:16 Galileo's Integrations and Partnerships
21:04 Philipp Krenn from Elastic
24:47 Optimizing LLM Responses
25:41 Galileo and Elastic: A Powerful Partnership
28:20 Challenges in AI Production and Trust
30:02 Guardrails and Reliability in AI Systems
32:17 The Future of AI in Customer Interaction
Follow the hosts
Follow Atin
Follow Conor
Follow Vikram
Follow Yash
Follow Today's Guest(s)
Connect with Philipp on LinkedIn
Learn more about Elastic
Check out Galileo
Try Galileo
Agent Leaderboard