Are Evals Dead?

MLOps.community

Using model judges carefully and surfacing real errors

Chiara warns that LLM judges can overlook faults and recommends explicitly instructing them to find and report real errors.
