#310 Stefano Ermon: Why Diffusion Language Models Will Define the Next Generation of LLMs

Eye On A.I.

Evaluation and benchmarks

Stefano describes using community benchmarks to track math, instruction following, QA, and code performance.

Play episode from 22:36
