
EA Forum Podcast (Curated & popular) “Leaving Open Philanthropy, going to Anthropic” by Joe_Carlsmith
Nov 6, 2025
Joe Carlsmith, a senior researcher specializing in AI risks, recently transitioned from Open Philanthropy to Anthropic. He reflects on his tenure at Open Philanthropy, discussing the importance of worldview investigations and AI safety research. Joe shares his aspirations for designing Claude's character at Anthropic and weighs the significance of model-spec design in mitigating existential risks. He addresses the complexities of working within frontier labs, advocating for balancing capability restraint with safety progress while navigating potential personal and ethical challenges in his new role.
Leading OpenPhil's Worldview Project
- Joe Carlsmith recounts joining and leading Open Philanthropy's Worldview Investigations team starting in 2019.
- He describes the team's mandate to document big-picture views on AI, produce research, and make those views publicly inspectable.
Model Spec Design Is A New Kind Of Problem
- Carlsmith argues designing a model spec for Claude is an unprecedented technical and philosophical challenge with rising stakes as AIs gain influence.
- He sees his background as especially suited to helping shape model character, despite debates about whether spec design is the crucial leverage point.
Work Both On Specs And Obedience
- Try engaging both with designing robust specs and with ensuring models obey them, since both affect catastrophic risk.
- Working at an AI firm can expose you to interactions between spec content and obedience dynamics, informing safer design choices.

