
Episode 63: Why Gemini 3 Will Change How You Build AI Agents with Ravin Kumar (Google DeepMind)
Vanishing Gradients
Practical eval design: code checks and judges
Hugo and Ravin describe Python-based checks, LLM judges, and gold datasets as the foundation of robust eval harnesses.
Play episode from 31:44
Transcript


