
LessWrong (30+ Karma) “Racing For AI Safety™ was always a bad idea, right?” by Wei Dai
Dec 3, 2025
Wei Dai, a cryptographer and prominent voice in the AI risk community, revisits historical debates over MIRI's controversial plan to build a Friendly AI. He argues that MIRI's uncertainty about alignment did not justify its approach, critiques its novel metaethics as risky, and discusses the dangers of unchecked power. He emphasizes the lack of public trust in MIRI's strategy, warning that it could inspire a dangerous race for AI safety and ultimately divert crucial resources from more effective solutions.
Relitigating Old Debates
- Wei Dai revisits past debates with Eliezer in order to correct the record on historical strategic mistakes.
- He frames this relitigation as a way to improve the AI x-risk community's strategic stance.
Flaws In MIRI's Takeover Plan
- MIRI's plan to build a Friendly AI was flawed in ways that go beyond technical uncertainty about alignment.
- Wei Dai argues that its philosophical novelty, hidden risks, and lack of justified confidence made the plan bad even ex ante.
Risk Of Inventing New Metaethics
- Rolling your own metaethics is risky even if technical alignment seems easy.
- Deploying novel or controversial philosophy creates moral and strategic hazards.
