Doom Debates

Why AI Alignment Is 0% Solved — Ex-MIRI Researcher Tsvi Benson-Tilsen

Oct 31, 2025
Tsvi Benson-Tilsen, a former MIRI researcher, spent seven years grappling with AI alignment challenges. He reveals a stark truth: humanity has made virtually no progress on this complex issue. Tsvi delves into critical concepts like reflective decision theory and corrigibility, illuminating why controlling superintelligence is so daunting. He discusses the implications of self-modifying AIs and the risks of ontological crises, prompting important debates about the limitations of current AI models and the urgent need for effective alignment strategies.
AI Snips
INSIGHT

Alignment Progress Is Essentially Zero

  • Tsvi argues humanity has made essentially 0% progress on AI alignment because the core problems are slippery and pre-paradigmatic.
  • He points to sociological and funding barriers that discourage the deep theoretical work the problem requires.
INSIGHT

Model Minds That Modify Themselves

  • MIRI studied reflective probability and decision theory to model minds that reason about, and modify, themselves.
  • The goal is to find descriptions of a mind that remain accurate even as that mind self-modifies; one formal handle on this is sketched below.
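One formal handle from this line of work (my gloss, not a quote from the episode) is the probabilistic reflection principle of Christiano, Yudkowsky, Herreshoff, and Barasz's "Definability of Truth in Probabilistic Logic": a coherent probability assignment P over sentences can know its own values approximately, even though exact self-knowledge is blocked by Liar-style diagonalization. Roughly:

```latex
% Probabilistic reflection schema (Christiano et al. 2013), stated roughly:
% for every sentence \varphi and rationals a < b,
\forall \varphi,\ \forall a, b \in \mathbb{Q}:\quad
  a < P(\varphi) < b \;\Longrightarrow\; P\bigl(a < P(\ulcorner\varphi\urcorner) < b\bigr) = 1
```

The strict inequalities are what make the schema consistent: demanding that P know its own values exactly reintroduces the diagonal paradox, while this approximate form admits a fixed point.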
INSIGHT

Logical Uncertainty Matters For Self-Reflection

  • Logical uncertainty arises when an agent must assign probabilities to facts it could deduce in principle but lacks the compute to settle.
  • Reasoning reflectively about one's own future decisions forces this kind of uncertainty everywhere; a toy example follows below.
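A toy illustration of the idea (mine, not from the episode; it assumes the mpmath library for digits of pi): the 100th decimal digit of pi is fully settled by deduction, but a bounded reasoner who has not yet run the computation can only assign it a probability, say 1/10 per digit by symmetry.

```python
# Toy logical uncertainty: a deducible fact treated probabilistically
# until the computation is actually run (assumes mpmath is installed).
from mpmath import mp

prior = 1 / 10  # symmetry heuristic: no reason to favor any particular digit

mp.dps = 110                     # work with 110 significant digits of pi
digits = str(mp.pi)              # "3.14159..."
digit_100 = int(digits[2 + 99])  # skip "3.", index the 100th decimal place

posterior = 1.0 if digit_100 == 9 else 0.0
print(f"digit = {digit_100}, prior = {prior}, after computing = {posterior}")
```

A reflective agent is in the same position with respect to its own future choices: predicting what it will decide is a computation it has not yet performed, so it must hold probabilistic beliefs about its own logical facts.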