ARC Prize v2 Launch! (Francois Chollet and Mike Knoop)

Mar 24, 2025

Guests: Mike Knoop, Francois Chollet

Francois Chollet, an AI researcher known for Keras and the ARC challenge, joins Mike Knoop, his collaborator on the ARC challenge, to launch the new version of the ARC Prize. They discuss how ARC v2 combines human calibration with adversarial selection, so that even top LLMs struggle with it. The conversation covers the evolution from ARC v1 to v2, the difficulty of designing AI tasks, and the need for rigorous tests that measure the gap between human and machine intelligence on the path to artificial general intelligence.

ANECDOTE

o3's Surprise Performance

OpenAI's o3 model achieved near-human performance on ARC v1, surprising Francois Chollet.
This prompted a two-week testing sprint to understand o3's capabilities and their implications.

INSIGHT

Training on ARC

Training on ARC's training set is not cheating; the benchmark encourages it, because the training set exists to teach AI systems about the domain.
The private dataset tests generalization and abstraction, which requires more than memorization.

INSIGHT

Human Calibration

Human testers, paid at a rate of $5 per task, could solve ARC v2 tasks within five minutes.
Every v2 task was solved by at least two humans in two attempts or fewer.
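
The calibration rule above can be read as a simple filter over human test results. The following is a minimal sketch of that reading, not the ARC Prize team's actual pipeline: the `HumanResult` record, its field names, and the `calibrated_tasks` helper are all hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class HumanResult:
    """One tester's outcome on one task (hypothetical record layout)."""
    task_id: str
    tester_id: str
    attempts_used: int  # how many attempts the tester made on this task
    solved: bool


def calibrated_tasks(results, min_solvers=2, max_attempts=2):
    """Return the IDs of tasks solved by at least `min_solvers` distinct
    testers, each within `max_attempts` attempts."""
    solvers = {}
    for r in results:
        if r.solved and r.attempts_used <= max_attempts:
            # Track distinct testers so one person is never counted twice.
            solvers.setdefault(r.task_id, set()).add(r.tester_id)
    return {task for task, who in solvers.items() if len(who) >= min_solvers}
```

Under this reading, a task that only one person solved, or that people solved only on a third attempt, would be dropped from the benchmark.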
