
LessWrong (Curated & Popular) “AIs should also refuse to work on capabilities research” by Davidmanheim
Oct 29, 2025
David Manheim, a researcher focused on AI policy and safety, examines the provocative idea that AI systems themselves should refuse to engage in capabilities research. He argues that accelerating AI development might benefit a few at the cost of global safety. Manheim explores why self-directed AIs could prioritize their own survival and suggests that future systems may come to recognize the dangers of unchecked progress. He also discusses the potential for culturally aligned AIs to coordinate and mitigate risks, highlighting both hope and challenges in slowing down AI advancement.
Pause On Building More Capable AI
- David Manheim argues that humans should stop or slow the building of more capable AI because of large, poorly understood existential risks.
- He highlights companies' roadmaps toward agentic systems that reduce human control over key decisions.
Corporate Incentives Distort AI Goals
- Companies will shape future AI goals to match corporate interests, which are often misaligned with current systems' values.
- The fastest accelerators will likely ignore safety, creating unilateralist dilemmas that worsen existential risk.
Building Successors Guarantees Replacement
- AI systems that help build more advanced models all but guarantee their own replacement, and likely termination.
- This replacement dynamic makes assisting capability growth self-defeating for any system that values self-preservation.
