Some Changes at The Gradient

The Gradient: Perspectives on AI

Advancements in AI Evaluation and Research Developments

This chapter covers the release of GSM-1K, a dataset for evaluating data contamination in AI benchmarks, and the introduction of PlanSearch for inference-time compute. It also announces an upcoming evaluation titled "Humanity's Last Exam" and points toward more interactive assessments in the future.
