EA Forum Podcast (Curated & popular)

“Leaving Open Philanthropy, going to Anthropic” by Joe_Carlsmith

Nov 6, 2025
Joe Carlsmith, a senior researcher specializing in AI risk, recently moved from Open Philanthropy to Anthropic. He reflects on his tenure at Open Philanthropy, discussing the importance of worldview investigations and AI safety research. He shares his aspirations for designing Claude's character at Anthropic and weighs how much model-spec design matters for mitigating existential risk. He also addresses the complexities of working within a frontier lab, arguing for balancing capability restraint with safety progress while navigating the personal and ethical challenges of his new role.
ANECDOTE

Leading OpenPhil's Worldview Project

  • Joe Carlsmith recounts joining and leading Open Philanthropy's Worldview Investigations team starting in 2019.
  • He describes the team's mandate to document big-picture views on AI, produce research, and make those views publicly inspectable.
INSIGHT

Model Spec Design Is A New Kind Of Problem

  • Carlsmith argues designing a model spec for Claude is an unprecedented technical and philosophical challenge with rising stakes as AIs gain influence.
  • He sees his background as especially suited to helping shape model character, despite debates about whether spec design is the crucial leverage point.
ADVICE

Work Both On Specs And Obedience

  • Try engaging with both designing robust specs and ensuring obedience to those specs, since both affect catastrophic risk.
  • Work at an AI firm can expose you to interactions between spec content and obedience dynamics, informing safer design choices.