Can AIs do AI R&D? Reviewing REBench Results with Neev Parikh of METR

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

00:00

Intro

This chapter introduces the Research Engineering Bench (REBench), a new benchmark for assessing AI systems on real-world machine learning tasks. It compares AI performance against that of human experts and discusses how recent advances could enable greater automation of AI research and development.
