How Does Claude 4 Think? — Sholto Douglas & Trenton Bricken

Dwarkesh Podcast

00:00

Intro

This chapter covers recent progress in reinforcement learning, with language models reaching human-level proficiency in tasks such as competitive programming and math. It also discusses the challenges of building agents that can sustain performance over long tasks, illustrated by experiments such as those with the Pokémon game.
