Brex’s AI Hail Mary — With CTO James Reggio

Latent Space: The AI Engineer Podcast

Evals: Ops QA and Multi-Turn Testing

Evals are co-developed with ops; multi-turn agent tests use LLM-as-judge and targeted integration test techniques.

Play episode from 52:14
