Why Even Staff Engineers Feel Behind on AI Right Now! (You are not alone!)

The Programming Podcast

Evals and LLM-as-Judge for Regression Tests

Danny recommends golden datasets, LLM graders, and CI checks to measure prompt regressions and fail the build when quality drops.
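
As a rough illustration of that workflow, here is a minimal Python sketch. It is not taken from the episode: the names `GoldenCase`, `stub_grader`, `pass_rate`, and `ci_exit_code` are hypothetical, and the grader is a simple word-overlap stand-in for where a real LLM judge call would go. The thresholds are arbitrary assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class GoldenCase:
    """One entry in a golden dataset: an input prompt and a reference answer."""
    prompt: str
    reference: str


def stub_grader(prompt: str, reference: str, candidate: str) -> float:
    """Hypothetical stand-in for an LLM judge.

    Scores the candidate by the fraction of reference words it contains.
    A real setup would call a grading model here instead.
    """
    ref_words = set(reference.lower().split())
    cand_words = set(candidate.lower().split())
    if not ref_words:
        return 0.0
    return len(ref_words & cand_words) / len(ref_words)


def pass_rate(
    cases: Sequence[GoldenCase],
    generate: Callable[[str], str],
    grade: Callable[[str, str, str], float] = stub_grader,
    min_score: float = 0.5,  # assumed per-case passing score
) -> float:
    """Run every golden case through the system and return the share that pass."""
    if not cases:
        raise ValueError("golden dataset is empty")
    passed = sum(
        1
        for case in cases
        if grade(case.prompt, case.reference, generate(case.prompt)) >= min_score
    )
    return passed / len(cases)


def ci_exit_code(rate: float, threshold: float = 0.9) -> int:
    """Return 0 when the pass rate meets the threshold, 1 to fail the build."""
    return 0 if rate >= threshold else 1
```

In a CI job, a script could compute `pass_rate(...)` for the current prompt and call `sys.exit(ci_exit_code(rate))`, so a drop below the threshold fails the build.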

Play episode from 32:15