

“Towards a Typology of Strange LLM Chains-of-Thought” by 1a3orn
Oct 11, 2025
Explore the intriguing phenomenon of strange chains-of-thought in reinforcement-learning-trained language models. The discussion works through six hypotheses, ranging from the evolution of a new, more efficient internal language to accidental byproducts known as spandrels. There is also a look at how context refresh can help reset reasoning, and at whether models intentionally obfuscate their thought processes. The ideas of natural drift and of conflicting learned sub-algorithms further highlight the complexities of how language develops in these models.
AI Snips
Emergent Internal Languages
- LLMs can develop new internal token systems that serve as compact tools for reasoning under RL objectives.
- Such emergent languages may be efficient for the model even if unintelligible to humans.
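A minimal sketch of why this can happen, under the assumption of an outcome-only reward (the episode does not specify any lab's exact reward function): if the objective scores only the final answer, nothing in the loss constrains the chain-of-thought to stay in readable English, so any token system that reliably reaches the answer is rewarded equally. The `####` answer delimiter below is a hypothetical convention, not from the source.

```python
# Sketch of an outcome-only reward: the chain-of-thought itself is never scored,
# so the model is free to drift toward whatever internal "language" helps it most.
def outcome_only_reward(rollout_text: str, gold_answer: str) -> float:
    # Hypothetical convention: the rollout ends with "#### <answer>".
    # Everything before the delimiter (the chain-of-thought) is ignored,
    # whether it is fluent English or an opaque emergent shorthand.
    final_answer = rollout_text.split("####")[-1].strip()
    return 1.0 if final_answer == gold_answer else 0.0

# Two rollouts with identical reward, despite very different chains-of-thought:
readable = "First add 3 and 4 to get 7, then double it. #### 14"
opaque = "zq7 zq7 merge dup >> #### 14"
assert outcome_only_reward(readable, "14") == outcome_only_reward(opaque, "14") == 1.0
```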
Spandrels From Reward Credit
- Nonfunctional token patterns can be reinforced by RL because all actions in a successful rollout receive credit.
- These accidental associations act like evolutionary spandrels and can persist without causal benefit.
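A toy illustration of the credit-assignment point, assuming a plain REINFORCE-style update (the episode does not name a specific algorithm): the scalar outcome reward multiplies the log-probability of every sampled token, so tokens that contributed nothing to the solution receive the same positive gradient as the tokens that did the work.

```python
# Illustrative REINFORCE-style loss (assumed setup, not from the episode):
# one scalar reward is spread uniformly over every token in the rollout.
import torch

def reinforce_loss(token_logprobs: torch.Tensor, reward: float) -> torch.Tensor:
    # token_logprobs: log pi(token_t | prefix) for each sampled token in one rollout.
    # Because the same reward scales every term, a nonfunctional "spandrel" token
    # sampled in a successful rollout is reinforced exactly like a useful one.
    return -(reward * token_logprobs).sum()

# A successful rollout (reward = 1.0): all four tokens, useful or not,
# get a gradient pushing their probability up.
logprobs = torch.tensor([-0.2, -1.3, -0.7, -2.1], requires_grad=True)
reinforce_loss(logprobs, reward=1.0).backward()
print(logprobs.grad)  # tensor([-1., -1., -1., -1.]) -- identical credit per position
```

Per-token baselines or process rewards would break this uniformity; the spandrel hypothesis applies when credit is assigned at the level of whole successful rollouts, as described above.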
Context Refresh Through Filler
- Models may emit filler or nonsensical tokens to 'refresh' context and escape repetitive local reasoning patterns.
- Such a context refresh can be rewarded when it enables better subsequent problem solving.