Everything Hard About Building AI Agents Today

Jun 13, 2025

Guests

Shreya Shankar

Willem Pienaar

Willem Pienaar, CTO of Cleric and creator of Feast, and PhD student Shreya Shankar tackle the toughest challenges in building AI agents. They discuss why 'ground truth' is ambiguous in evaluations and describe three gulfs of human-AI interaction that hinder success. Both emphasize moving humans out of the feedback loop and relying on implicit signals to learn faster. They also cover practical techniques such as heat maps of task failures, the complexities of simulated environments, and the inevitable performance ceiling of AI.

INSIGHT

Challenge of AI Verification in Production

Verifying an AI system's results in production is hard because there is often no clear ground truth to compare against.
Fast learning loops that do not depend on human review are essential for improving AI agents effectively.

ANECDOTE

DocETL's LLM MapReduce Pipeline

Shreya's DocETL system uses LLMs as map and reduce operators to process large volumes of unstructured data.
Verification is challenging because users cannot tell whether the LLM missed something in the data.
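
The map-and-reduce idea in this snip can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system's actual API: `run_pipeline`, `stub_map`, and `stub_reduce` are hypothetical names, and the two stub operators use plain string logic where a real pipeline would call an LLM with a prompt.

```python
from typing import Callable, Iterable, List

# Hypothetical operator types: in a real pipeline each would wrap an LLM call.
MapOp = Callable[[str], List[str]]
ReduceOp = Callable[[List[str]], str]


def run_pipeline(docs: Iterable[str], map_op: MapOp, reduce_op: ReduceOp) -> str:
    """Apply map_op to every document, then combine all results with reduce_op."""
    extracted: List[str] = []
    for doc in docs:
        extracted.extend(map_op(doc))
    return reduce_op(extracted)


def stub_map(doc: str) -> List[str]:
    # Stand-in for an LLM prompt like "list every performance complaint".
    # Assumption: a keyword match replaces the model's judgment here.
    return [s.strip() for s in doc.split(".") if "slow" in s.lower()]


def stub_reduce(items: List[str]) -> str:
    # Stand-in for an LLM prompt like "merge these findings into one summary".
    return "; ".join(sorted(set(items)))


docs = [
    "The app is slow. I like the colors.",
    "Checkout is slow. Support was kind.",
]
summary = run_pipeline(docs, stub_map, stub_reduce)
print(summary)
```

Nothing in the pipeline's output reveals whether the map step skipped a relevant sentence, which is the verification problem described above: a missed item is simply absent from the summary.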

INSIGHT

Bridging AI Communication Gulfs

Successful AI products must bridge the "gulfs" of specification and generalization that open up when users try to communicate their intent.
Tools for detailed prompt engineering help with specification; other strategies are needed to address generalization errors.
