Are Evals Dead?

MLOps.community

Using model judges carefully and surfacing real errors

Chiara warns that LLM judges can overlook faults and encourages instructing them to find and report real errors.

Segment starts at 14:20.