
Y Combinator Startup Podcast: How Intelligent Is AI, Really?
Dec 17, 2025
Greg Kamradt, President of the ARC Prize Foundation, discusses new approaches to measuring AI intelligence. He critiques standard benchmarks for rewarding scale rather than learning, and explains how ARC-AGI challenges reveal AI's reasoning capabilities, noting the jump in scores from older models to newer reasoning models. He previews an upcoming interactive benchmark in which AI must infer goals without instructions. The conversation covers the difficulty of measuring true intelligence and the implications for future AGI development.
AI Snips
Intelligence As Efficient Learning
- ARC defines intelligence as the ability to learn new things efficiently, not as raw task performance.
- ARC tests that kind of generalization with problems humans can solve but models historically could not.
Models Jumped From Near-Zero To Noticeable Gains
- Early LLMs scored around 4–5% on the original ARC benchmark, while humans could solve the tasks.
- A later model's jump to ~21% marked a rapid shift once reasoning capabilities improved.
Public Scores Can Be Misleading
- When big labs report ARC scores, it boosts visibility but risks creating vanity metrics disconnected from the mission.
- ARC's core mission remains pulling forward open progress toward human-like generalization.