
Are Evals Dead?
MLOps.community
Designing short multi-turn simulations and iterative error analysis
Chiara describes running short simulations of 4–5 turns, testing in phases, and adding observed failures to eval sets for regression checks.
Segment starts at 05:52
Transcript


