Some Changes at The Gradient

The Gradient: Perspectives on AI

CHAPTER

Advancements in AI Evaluation and Research Developments

This chapter covers the release of GSM-1K, a dataset for evaluating data contamination in AI benchmarks, and the introduction of PlanSearch for inference-time compute. It also announces an upcoming evaluation titled 'Humanity's Last Exam' and points toward more interactive assessments in the future.
