Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

May 22, 2025

Guest

Trenton Bricken

Guest

Sholto Douglas

Sholto Douglas, a reinforcement learning researcher at Anthropic, and Trenton Bricken, who works on mechanistic interpretability, discuss the evolving landscape of AI. They cover recent advances in reinforcement learning and the implications of AI performing tasks at a human level. They explore how to trace AI models' thought processes and the challenges of aligning AI with human values. They also address the future of AI in the workplace, emphasizing the need for individuals to adapt and engage with these technologies.

INSIGHT

RL Achieves Expert Human Performance

Reinforcement learning with language models has finally been shown to reach expert human-level performance on complex tasks such as competitive programming and math.
Scalable, long-running autonomous agentic performance is emerging now and is expected to improve significantly within a year.

INSIGHT

Why Software Excels in RL

Software engineering benefits from very clear, verifiable reward signals like passing unit tests, making RL highly effective.
Creative tasks such as writing require taste, which is harder to quantify and reward precisely.
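
To make the idea of a verifiable reward concrete, here is a minimal sketch, assuming a candidate solution is a plain Python function and the unit tests are input/output pairs. The name `unit_test_reward` and the all-or-nothing scoring are illustrative assumptions, not something described in the episode.

```python
from typing import Callable, List, Tuple


def unit_test_reward(
    candidate: Callable[[int], int],
    cases: List[Tuple[int, int]],
) -> float:
    """Hypothetical reward: 1.0 if the candidate passes every case, else 0.0."""
    for arg, expected in cases:
        try:
            if candidate(arg) != expected:
                return 0.0  # wrong answer on this case
        except Exception:
            return 0.0  # a crash counts as a failed test
    return 1.0


# Usage: score two candidate implementations of "square a number".
cases = [(2, 4), (3, 9)]
reward_good = unit_test_reward(lambda x: x * x, cases)
reward_bad = unit_test_reward(lambda x: x + x, cases)
```

The reward here needs no human judgment: the tests either pass or they do not. A comparable function for the quality of a piece of writing has no such obvious pass/fail check.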

INSIGHT

RL Adds New Knowledge to Models

RL can teach neural nets new knowledge beyond pre-training, given a clean and sufficient reward signal.
The key limitation is often the availability and quality of feedback, which constrains how much the model can learn.
