LessWrong (30+ Karma)

“Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety” by Tomek Korbak, Mikita Balesni, Vlad Mikulik, Rohin Shah

Jul 15, 2025