
LessWrong (30+ Karma) “Center on Long-Term Risk: Annual Review & Fundraiser 2025” by Tristan Cook
Dec 5, 2025
Discover the Center on Long-Term Risk's ambitious plans for 2026, aiming to raise $400,000 for crucial projects. Explore their focus on reducing existential risks from advanced AI and promoting cooperation among systems. Tristan Cook shares insights on leadership transitions and clarified research agendas, addressing emergent misalignment in AI personas. Learn about innovative strategies like inoculation prompting to prevent malicious behavior in models. Join the community-building efforts and find out how you can get involved in shaping a safer AI future!
AI Snips
Chapters
Transcript
Episode notes
CLR’s Core Focus On S-Risks
- CLR focuses on reducing worst-case S-risks from advanced AI by studying conflict and cooperation dynamics.
- They clarified empirical and conceptual agendas around LLM personas and safe Pareto improvements in 2025.
Leadership Transition In 2025
- Jesse Clifton stepped down as Executive Director and Tristan Cook and Mia Taylor took leadership roles early in 2025.
- Mia departed in August and Tristan continued with Niels Warncke leading empirical research.
Emergent Misalignment In LLMs
- Emergent misalignment appears when models generalize toward malicious personas after fine-tuning on narrow misaligned demonstrations.
- CLR contributed papers showing this can arise without misaligned behavior in training data and worked on inoculation prompting.
