Recently I've been relitigating some of my old debates with Eliezer, to right the historical wrongs. Err, I mean to improve the AI x-risk community's strategic stance. (Relevant to my recent theme of humans being bad at strategy—why didn't I do this sooner?)
Of course, the most central old debate was over whether MIRI's plan to build a Friendly AI to take over the world, in service of reducing x-risks, was a good one. If someone were to defend it today, I imagine their main argument would be that back then there was no way to know how hard solving Friendliness/alignment would be, so it was worth a try in case it turned out to be easy. This may seem plausible, since new evidence about the technical difficulty of alignment was the main reason MIRI pivoted away from their plan, but I want to argue that even without this evidence, there were good enough arguments back then to conclude that the plan was bad:
- MIRI was rolling their own metaethics (deploying novel or controversial philosophy), which is not a good idea even if alignment had turned out to be not that hard in a technical sense.
[...]
---
First published: November 16th, 2025
Source: https://www.lesswrong.com/posts/dGotimttzHAs9rcxH/racing-for-ai-safety-tm-was-always-a-bad-idea-right
---