
LessWrong (30+ Karma) “Problems I’ve Tried to Legibilize” by Wei Dai
Nov 10, 2025
Wei Dai, a veteran contributor to the LessWrong community and a longtime thinker on AI safety, examines what it takes to make AI risk legible. He surveys key philosophical problems, including decision theory and metaethics, and critiques specific philosophical and alignment ideas such as utilitarianism. Dai highlights human-AI safety risks arising from value conflicts and the difficulty of making complex problems accessible to decision makers. He remains hopeful that future AI advisors can improve our understanding of these issues in AI governance.
AI Snips
Legibilizing Hard Problems
- Wei Dai frames much of his work as making hard problems legible to himself and others.
- Legibilizing clarifies what needs attention in AI risk and guides future work.
Philosophical Foundations Matter
- Wei Dai lists core philosophical problems that need clearer framing, including probability, decision theory, and metaethics.
- These foundational issues shape how we reason about AI risk and possible interventions.
Scrutinize Alignment Proposals
- Dai highlights problems with specific philosophical and alignment ideas such as Solomonoff induction, corrigibility, and provable safety.
- Critically examining these concepts prevents overreliance on seemingly rigorous but possibly flawed solutions.
