

Episode 57: AI Agents and LLM Judges at Scale: Processing Millions of Documents (Without Breaking the Bank)
Aug 29, 2025
Shreya Shankar, a PhD candidate at UC Berkeley with prior experience at Google Brain and Facebook, dives into the world of AI agents and large-scale document processing. She explains how LLMs can handle vast amounts of data accurately without breaking the bank. Topics include the importance of human review of errors, turning ad hoc LLM workflows into reliable pipelines, and the trade-off between cheap and expensive models. Shreya also discusses how guardrails and structured approaches can make LLM outputs dependable in real-world applications.
AI Snips
Treat LLM Workflows As ETL Pipelines
- Treat LLM workflows as ETL: map operators extract attributes from each document, and reduce operators aggregate or summarize across documents (see the sketch after this list).
- Search for the most accurate pipeline first, then optimize for cost while still meeting that accuracy bar.
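A minimal Python sketch of that ETL framing, assuming a hypothetical call_llm(prompt) helper in place of a real model client; the function names and prompts here are illustrative, not the episode's actual API:

```python
import json

# Hypothetical helper: wrap whatever LLM client you actually use behind one function.
def call_llm(prompt: str) -> str:
    """Send a prompt to a model and return its raw text response (stub)."""
    raise NotImplementedError("plug in your model client here")

def extract_themes(doc: str) -> list[str]:
    """Map operator: extract per-document attributes (here, a list of themes)."""
    prompt = (
        "List the main themes in the following document "
        "as a JSON array of strings.\n\n" + doc
    )
    return json.loads(call_llm(prompt))

def summarize_themes(themes_per_doc: list[list[str]]) -> str:
    """Reduce operator: aggregate the mapped outputs into one summary."""
    flat = [theme for themes in themes_per_doc for theme in themes]
    prompt = "Summarize the recurring themes in this list:\n" + "\n".join(
        f"- {theme}" for theme in flat
    )
    return call_llm(prompt)

def run_pipeline(docs: list[str]) -> str:
    """Accuracy-first pipeline: map over every document, then reduce."""
    return summarize_themes([extract_themes(doc) for doc in docs])
```

Once a pipeline like this hits the accuracy target, the cost optimization is typically swapping cheaper models into individual operators and re-checking accuracy, rather than redesigning the pipeline.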
Use Guardrails For Flaky LLM Outputs
- Add retries, code-based validators, and cheap LLM checks to catch flaky outputs such as empty strings or garbage.
- Use 'gleaning' to re-run or validate until outputs satisfy simple properties like a minimum theme count (see the sketch after this list).
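A rough sketch of the gleaning idea, continuing the map operator from the sketch above: a code-based validator checks simple properties (non-empty strings, a minimum theme count), and the extraction is re-run with failure feedback until it passes or a retry budget runs out. The threshold of 3 themes, the feedback wording, and the retry count are illustrative assumptions; a cheap LLM check could stand in for or supplement the code-based validator.

```python
import json

def passes_checks(themes: list[str], min_themes: int = 3) -> bool:
    """Code-based validator: non-empty strings and at least min_themes items.
    The threshold is an illustrative assumption."""
    return len(themes) >= min_themes and all(
        isinstance(t, str) and t.strip() for t in themes
    )

def extract_themes_with_gleaning(doc: str, max_attempts: int = 3) -> list[str]:
    """Re-run the map operator (extract_themes from the sketch above) until its
    output satisfies passes_checks, feeding a failure note back into the prompt."""
    feedback = ""
    for _ in range(max_attempts):
        try:
            themes = extract_themes(doc + feedback)
        except json.JSONDecodeError:
            themes = []  # treat garbage (non-JSON) output as a failed attempt
        if passes_checks(themes):
            return themes
        feedback = (
            "\n\n(The previous answer was empty, malformed, or too short; "
            "return at least 3 distinct themes as a JSON array.)"
        )
    raise ValueError("LLM output never met the validation criteria")
```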
Statewide Police Records Project
- Berkeley's California Police Records Access Project used LLMs to build a police-misconduct database that would have taken humans an estimated 35 years to compile.
- The project required careful prompt specifications, repeated iteration, and intern-led error analysis before full deployment.