Demis Hassabis on shipping momentum, better evals and world models

Google AI: Release Notes

Game Arena's Scaling

  • Game Arena's tests automatically get harder as AI systems improve, unlike benchmarks such as AIME or GPQA, where humans must keep writing increasingly difficult questions.
  • Each game is unique because the players create it as they play, which benefits testing by preventing overfitting on training data.