
LessWrong (30+ Karma) Unless its governance changes, Anthropic is untrustworthy
Nov 29, 2025
The discussion centers on Anthropic's trustworthiness, highlighting the disconnect between their founding promises and current actions. Leaders initially vowed to prioritize safety and avoid pushing AI frontiers, yet evidence suggests they have shifted towards aggressive capability advancement. The podcast critiques internal governance issues and reveals lobbying attempts to undermine regulation, alongside private assurances contradicting public statements. It calls for employees to demand accountability and transparency, urging a reevaluation of the company's true mission.
AI Snips
Chapters
Transcript
Episode notes
Founding Purpose Versus Reality
- Anthropic was founded to do safety research on frontier models, not to race for capabilities.
- Instead, leadership drifted toward racing and commercial focus, undermining that original purpose.
Commitment Perception From Investors
- Multiple investors and employees came away from conversations with Dario Amodei believing he committed not to push the frontier.
- Later releases like Claude 3 Opus contradicted that impression, creating perceptions of broken promises.
No Operational Plan For Pessimistic Scenarios
- Anthropic claimed it would act as if we're in a pessimistic alignment scenario until proven otherwise.
- The company hasn't shown institutionalized triggers or public actions consistent with that assumption.
