

Everything Hard About Building AI Agents Today
Jun 13, 2025
Join Willem Pienaar, CTO of Cleric and creator of Feast, and PhD student Shreya Shankar as they tackle the toughest challenges in building AI agents. They discuss the ambiguity of "ground truth" in evaluations and describe three gulfs of human-AI interaction that hinder success. Both emphasize moving humans out of the feedback loop and using implicit signals for faster learning. They also explore practical techniques such as heat maps for task failures, the complexities of simulated environments, and the inevitable performance ceiling of AI.
AI Snips
Challenge of AI Verification in Production
- Verifying an AI system's results in production is hard because there is often no clear ground truth.
- Rapid learning loops that bypass human review are essential to improve AI agents effectively.
DocETL's LLM MapReduce Pipeline
- Shreya's DocETL system uses LLMs as map and reduce operators to process large volumes of unstructured data.
- Verification is challenging because users don't know if the LLM missed anything in the data.
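To make the map and reduce idea concrete, here is a minimal Python sketch. It is not the API of the system described above: `llm_map`, `llm_reduce`, and `toy_llm` are hypothetical names, and `toy_llm` is a deterministic stub standing in for a real model call so the example runs offline.

```python
from typing import Callable, Iterable

# Hypothetical stand-in type for an LLM call: prompt in, text out.
LLM = Callable[[str], str]

def llm_map(llm: LLM, instruction: str, docs: Iterable[str]) -> list[str]:
    """Apply the same instruction to every document independently."""
    return [llm(f"{instruction}\n\n{doc}") for doc in docs]

def llm_reduce(llm: LLM, instruction: str, items: Iterable[str]) -> str:
    """Combine the per-document outputs into a single result."""
    joined = "\n".join(f"- {item}" for item in items)
    return llm(f"{instruction}\n\n{joined}")

def toy_llm(prompt: str) -> str:
    # Assumption: a stub, not a real model. It returns only the
    # last line of the prompt, upper-cased.
    return prompt.strip().splitlines()[-1].upper()

docs = ["ticket one: login fails", "ticket two: slow dashboard"]
extracted = llm_map(toy_llm, "Extract the main complaint.", docs)
summary = llm_reduce(toy_llm, "Summarize the complaints.", extracted)
print(extracted)
print(summary)
```

The stub's reduce step keeps only the last item and silently drops the first. Nothing in `summary` reveals that omission, which is a toy version of the verification problem in the bullet above.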
Bridging AI Communication Gulfs
- Successful AI products must bridge the "gulfs" of specification and generalization that arise when users communicate their intent.
- Tools for detailed prompt engineering improve specification; other strategies address generalization errors.