
Are Evals Dead?
MLOps.community
Aggregating distributions and agent-generated error summaries
Chiara advocates collecting free-form, agent-generated failure summaries, grouping them into categories, and using the distribution of those categories to spot issues.
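As a rough illustration of the aggregation step, here is a minimal Python sketch. It assumes each failure summary has already been assigned a category label; the label names, the `category_distribution` and `flag_dominant` helpers, and the 0.5 threshold are all hypothetical and not taken from the episode.

```python
from collections import Counter


def category_distribution(labels):
    """Return each category's share of all labelled failures, most common first."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.most_common()}


def flag_dominant(distribution, threshold=0.5):
    """List categories whose share meets or exceeds the (assumed) threshold."""
    return [category for category, share in distribution.items() if share >= threshold]


# Hypothetical labels, one per free-form failure summary.
labels = [
    "tool_call_error",
    "hallucinated_field",
    "tool_call_error",
    "timeout",
    "tool_call_error",
]

dist = category_distribution(labels)
dominant = flag_dominant(dist)

for category, share in dist.items():
    print(f"{category}: {share:.0%}")
print("dominant:", dominant)
```

Looking at shares rather than raw counts makes runs of different sizes comparable, so a category that suddenly takes up a larger share of failures stands out.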
Transcript