The Valmy

Paul Christiano - Preventing an AI Takeover

Nov 1, 2023
03:07:01
Podcast: Dwarkesh Podcast
Episode: Paul Christiano - Preventing an AI Takeover
Release date: 2023-10-31

Paul Christiano is the world’s leading AI safety researcher. My full episode with him is out!

We discuss:

- Does he regret inventing RLHF, and is alignment necessarily dual-use?

- Why he has relatively modest timelines (40% by 2040, 15% by 2030),

- What do we want a post-AGI world to look like (do we want to keep gods enslaved forever)?

- Why he’s leading the push to get labs to develop responsible scaling policies, and what it would take to prevent an AI coup or bioweapon,

- His current research into a new proof system, and how this could solve alignment by explaining a model's behavior,

- and much more.

Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform. Read the full transcript here. Follow me on Twitter for updates on future episodes.

Open Philanthropy

Open Philanthropy is currently hiring for twenty-two different roles to reduce catastrophic risks from fast-moving advances in AI and biotechnology, including grantmaking, research, and operations.

For more information and to apply, please see the application: https://www.openphilanthropy.org/research/new-roles-on-our-gcr-team/

The deadline to apply is November 9th; make sure to check out those roles before they close.

Timestamps

(00:00:00) - What do we want a post-AGI world to look like?

(00:24:25) - Timelines

(00:45:28) - Evolution vs gradient descent

(00:54:53) - Misalignment and takeover

(01:17:23) - Is alignment dual-use?

(01:31:38) - Responsible scaling policies

(01:58:25) - Paul’s alignment research

(02:35:01) - Will this revolutionize theoretical CS and math?

(02:46:11) - How Paul invented RLHF

(02:55:10) - Disagreements with Carl Shulman

(03:01:53) - Long TSMC but not NVIDIA



Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe
