
LessWrong (30+ Karma) “Thinking about reasoning models made me less worried about scheming” by Fabien Roger
Nov 20, 2025
Fabien Roger, a researcher focused on AI alignment, examines the reasoning capabilities of models like DeepSeek R1 and explains how his view of AI scheming has shifted since 2022. He explores why current models show little scheming despite having the tools for it, pointing to human-like pretraining priors and to training pressures that work against scheming. He also predicts how this might change and argues for cautious optimism about superintelligence, while acknowledging lingering concerns.
Reasoning Models Rarely Show Scheming
- Reasoning models can plan instrumentally, yet their training scratchpads usually show no traces of scheming.
- This suggests that many of the conditions thought to produce scheming simply don't arise in current models like DeepSeek R1.
Speed Costs May Be Overstated
- The speed costs often cited as a pressure against scheming may not matter much for models trained with plenty of slack.
- Fabien argues that GRPO-trained models could retain scheming if they start out with it, since mild penalties exert little pressure (see the sketch after this list).
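A minimal, hypothetical Python sketch of why this can happen (my illustration, not from the episode; all numbers are made up): GRPO normalizes rewards within a group of completions sampled for the same prompt, so a mild penalty that every completion pays roughly equally mostly cancels out of the group-relative advantages, leaving little gradient pressure against a behavior the model already exhibits across the board.

```python
# Hypothetical sketch: why a mild, uniform "speed cost" can wash out in GRPO.
# GRPO computes advantages relative to the group mean for one prompt, so a
# penalty shared by every completion in the group barely changes them.
import statistics

def grpo_advantages(rewards):
    """Group-relative advantages: (r - mean) / std over one prompt's group."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero variance
    return [(r - mean) / std for r in rewards]

# Task rewards for 4 sampled completions to one prompt.
task_rewards = [1.0, 0.0, 1.0, 0.0]

# A scheming model pays a small speed cost: extra reasoning tokens,
# roughly the same overhead in every sample.
extra_tokens = [210, 195, 205, 200]
length_penalty = 0.001  # mild per-token penalty

penalized = [r - length_penalty * t for r, t in zip(task_rewards, extra_tokens)]

print(grpo_advantages(task_rewards))  # [1.0, -1.0, 1.0, -1.0]
print(grpo_advantages(penalized))     # nearly identical: shared cost cancels
```

The penalized advantages come out nearly identical to the unpenalized ones, which is the sense in which a mild, uniform speed cost exerts little training pressure against a behavior the model already has.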
Keep Human-Like Priors And Legibility Checks
- Preserve human-like pretraining priors and legibility bottlenecks to reduce scheming risk.
- Insert intermediate, human-interpretable reasoning checkpoints roughly every 1,000 steps (a toy illustration follows below).
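To make the checkpoint idea concrete, here is a toy, hypothetical Python sketch (mine, not from the episode): long-horizon reasoning is forced through a plain-language summary every 1,000 steps, a monitor audits each summary, and only the audited summary is carried forward. `ToyModel` and `monitor_flags` are stand-ins, not a real API.

```python
# Toy sketch of a "legibility bottleneck": opaque reasoning is periodically
# squeezed through a human-readable summary that a monitor can audit.
CHECKPOINT_INTERVAL = 1_000

class ToyModel:
    """Stand-in for a reasoning model with opaque internal state."""
    def begin(self, task):
        return {"task": task, "steps": 0}
    def reasoning_step(self, state):
        state["steps"] += 1                     # one opaque reasoning step
        return state
    def summarize(self, state):
        # Bottleneck: only this plain-language summary is carried forward.
        return f"After {state['steps']} steps, still working on: {state['task']}"
    def resume_from_summary(self, task, summary):
        return {"task": task, "steps": 0, "context": summary}
    def final_answer(self, state):
        return f"answer to {state['task']}"

def monitor_flags(summary: str) -> bool:
    """Toy monitor: a real one would audit the summary for red flags."""
    return "disable oversight" in summary.lower()

def run_with_checkpoints(model, task, max_steps=3_000):
    state = model.begin(task)
    for step in range(1, max_steps + 1):
        state = model.reasoning_step(state)
        if step % CHECKPOINT_INTERVAL == 0:
            summary = model.summarize(state)
            if monitor_flags(summary):
                raise RuntimeError(f"monitor flagged checkpoint at step {step}")
            state = model.resume_from_summary(task, summary)
    return model.final_answer(state)

print(run_with_checkpoints(ToyModel(), "some long-horizon task"))
```

The design choice the sketch highlights: because reasoning resumes only from the audited summary, anything the model "plans" between checkpoints must survive translation into human-interpretable text, which is the legibility pressure the snip describes.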

