

Episode 51: Why We Built an MCP Server and What Broke First
Jun 26, 2025
In this discussion, Philip Carter, Product Management Director at Salesforce and former Principal PM at Honeycomb, shares lessons from building LLM-powered features. He explains what it takes to connect these systems to real production data, and digs into the challenges of tool use, prompt templates, and flaky model behavior. He also discusses building an MCP server to enhance observability in AI systems, its role in improving user experience, and the pitfalls of shipping such features in a SaaS product.
AI Snips
Spreadsheet-Driven LLM Eval Process
- Philip Carter described collecting real user inputs and outputs for their LLM feature in spreadsheets and analyzing them row by row.
- By iterating on judgments between himself and an LLM judge until the two agreed, the team built a tightly aligned evaluation system that drove better product performance (a rough sketch of that alignment loop follows below).
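A minimal sketch of that spreadsheet-driven alignment loop, assuming rows of (user input, model output, human judgment) exported from the spreadsheet; `llm_judge` is a hypothetical stand-in for a real model call, not the actual judge prompt discussed in the episode.

```python
from typing import Callable

# Each row mirrors a spreadsheet line: the real user input, the feature's
# output, and the human judgment ("good" / "bad").
Row = tuple[str, str, str]

def llm_judge(user_input: str, output: str) -> str:
    """Hypothetical stand-in for a real LLM-as-judge call. In practice this
    would send a judging prompt to a model and parse its verdict into the
    same label set the human uses."""
    return "good" if output.strip() else "bad"

def alignment_report(rows: list[Row], judge: Callable[[str, str], str]) -> list[Row]:
    """Compare human labels with judge labels; return the disagreements."""
    agree = 0
    disagreements: list[Row] = []
    for user_input, output, human_label in rows:
        if judge(user_input, output) == human_label:
            agree += 1
        else:
            disagreements.append((user_input, output, human_label))
    print(f"agreement: {agree}/{len(rows)} ({agree / len(rows):.0%})")
    return disagreements

if __name__ == "__main__":
    sample = [
        ("slow queries on checkout", "filtered traces by endpoint = /checkout", "good"),
        ("why is p99 latency up?", "", "bad"),
        ("errors in the payments service", "ran a query with no filters", "bad"),
    ]
    for user_input, _output, human_label in alignment_report(sample, llm_judge):
        # Disagreements are the rows worth re-reading: either the judge prompt
        # needs tightening or the human label was wrong.
        print(f"REVIEW  human={human_label!r}  input={user_input!r}")
```

Running it prints an agreement rate plus the disagreement rows to re-read, which is the loop the snip describes: re-examine disagreements, tweak the judge prompt or the label, and repeat until human and judge line up.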
Levers to Align the LLM and Fix Errors
- Improve an LLM-driven system by tuning prompts, adding few-shot examples, and applying deterministic post-processing rules (sketched after this list).
- Use observability to spot errors on real user data and fix them quickly; that is where the practical improvements come from.
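A rough sketch of those three levers, under the assumption that the feature translates natural-language questions into a structured query; the prompt text, few-shot examples, and `fix_common_errors` rule are illustrative, not the prompts used in the episode.

```python
import json

# Lever 1: a prompt template that can be tuned independently of the code.
SYSTEM_PROMPT = (
    "You translate natural-language questions into a JSON query. "
    "Respond with JSON only, using the keys 'filter' and 'group_by'."
)

# Lever 2: few-shot examples drawn from real (input, good output) pairs.
FEW_SHOT = [
    {"question": "slowest endpoints today",
     "query": {"filter": "duration_ms > 0", "group_by": "endpoint"}},
    {"question": "errors by service",
     "query": {"filter": "status_code >= 500", "group_by": "service"}},
]

def build_messages(question: str) -> list[dict]:
    """Assemble chat messages: system prompt, few-shot pairs, then the real question."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    for ex in FEW_SHOT:
        messages.append({"role": "user", "content": ex["question"]})
        messages.append({"role": "assistant", "content": json.dumps(ex["query"])})
    messages.append({"role": "user", "content": question})
    return messages

# Lever 3: deterministic post-processing that cleans up known model quirks
# instead of re-prompting.
def fix_common_errors(raw: str) -> dict:
    """Strip markdown fences the model sometimes adds, then fill missing keys."""
    text = raw.strip().removeprefix("```json").removeprefix("```").removesuffix("```").strip()
    query = json.loads(text)
    query.setdefault("group_by", None)  # tolerate a missing optional key
    return query

if __name__ == "__main__":
    print(build_messages("p99 latency by region")[-1])
    print(fix_common_errors('```json\n{"filter": "duration_ms > 500"}\n```'))
```

The design point is that each lever fails independently: prompt and few-shot changes shift model behavior, while the post-processing rule catches the residual, repeatable mistakes deterministically.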
MCP Enables Live Data Integration
- MCP bridges general-purpose LLMs with live APIs, unlocking broad workflow orchestration across enterprise tools (see the sketch below).
- Real-world data volume and idiosyncrasies strain LLM reliability and context-window limits, so keeping the integration working takes ongoing engineering effort.
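A minimal sketch of that kind of bridge, assuming the official MCP Python SDK's `FastMCP` helper (`pip install mcp`); the API URL, the `query_events` tool, and the character cap are hypothetical, and a real server would need auth, pagination, and smarter summarization than a hard truncation.

```python
import json
import urllib.parse
import urllib.request

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("observability-bridge")

# Crude guard so large real-world result sets don't blow the context window.
MAX_CHARS = 20_000

@mcp.tool()
def query_events(service: str, limit: int = 50) -> str:
    """Fetch recent events for a service from a (hypothetical) live API and
    return a JSON string small enough for the model to fit in context."""
    url = (
        "https://api.example.com/v1/events"
        f"?service={urllib.parse.quote(service)}&limit={limit}"
    )
    with urllib.request.urlopen(url, timeout=10) as resp:
        events = json.load(resp)
    payload = json.dumps(events)
    if len(payload) > MAX_CHARS:
        # Real data is bigger and messier than demo data: trim rather than fail.
        payload = payload[:MAX_CHARS] + "... (truncated)"
    return payload

if __name__ == "__main__":
    # Serve over stdio so an MCP-capable client can launch and talk to it.
    mcp.run(transport="stdio")
```

The tool boundary is where the "what broke first" problems show up: the model only sees whatever the tool returns, so result-size control and error handling live on the server side, not in the prompt.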