We Taught AI to Play Games—Now It’s a $3.6 Million Company

AI & I

00:00

Why Static Benchmarks Fail for LLMs

Dan and Alex discuss the limits of static evaluations and why dynamic, game-based tests reveal more about model capabilities.
