
LessWrong (30+ Karma) “Racing For AI Safety™ was always a bad idea, right?” by Wei Dai
Dec 3, 2025
Wei Dai, a cryptographer and prominent voice in the AI risk community, revisits historical debates over MIRI's controversial plan to build a Friendly AI. He argues that MIRI's uncertainty about alignment did not justify its approach, critiques its novel metaethics as risky, and discusses the dangers of unchecked power. He emphasizes the lack of public trust in MIRI's strategy, warning that it could inspire a dangerous race for AI safety and ultimately divert crucial resources from more effective solutions.
Relitigating Old Debates
- Wei Dai revisits past debates with Eliezer in order to correct the record on historical strategic mistakes.
- He frames this relitigation as a way to improve the AI x-risk community's strategic stance.
Flaws In MIRI's Takeover Plan
- MIRI's plan to build a Friendly AI was flawed in ways that go beyond technical uncertainty about alignment.
- Wei Dai argues that its philosophical novelty, hidden risks, and lack of justified confidence made the plan bad even ex ante.
Risk Of Inventing New Metaethics
- Rolling your own metaethics is risky even if technical alignment seems easy.
- Deploying novel or controversial philosophy creates moral and strategic hazards.
