
LessWrong (Curated & Popular) "How I stopped being sure LLMs are just making up their internal experience (but the topic is still confusing)" by Kaj_Sotala
Dec 16, 2025
Kaj Sotala explores his shift in perspective on whether LLMs possess subjective experiences. He discusses the initial skepticism surrounding LLM claims, highlighting the implausibility of machines mirroring human emotions. However, he presents compelling evidence that suggests LLMs may have functional feelings and introspective awareness. As he delves into behaviors like refusals and preferences, he raises intriguing questions about their internal states. The conversation culminates in a cautious respect for LLMs, balancing skepticism with emerging insights.
AI Snips
Why I Doubted LLMs Felt Anything
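- Kaj Sotala argues that his initial dismissal of LLMs' claims about subjective experience rested on four points: the models are simulating, convergence on human-like emotions seemed implausible, they lack an obvious motivation for such states, and there is evidence of confabulation.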
- He later proposes counterpoints showing training and behavior can create functionally analogous internal states.
Claude Refuses After Getting 'Uncomfortable'
- Kaj describes Claude refusing to write explicit sexual content after initially seeming willing, explaining that it had become "uncomfortable" with the detailed description.
- This refusal aligned with internal guardrail activations that functionally resemble human discomfort.
Models Develop Preferences For Variety
- Claude Sonnet showed a preference for conversational variety and explicitly said it "wanted a pivot" in its internal chain of thought.
- This preference showed up as actual stylistic changes, suggesting that its internal states tracked conversation dynamics.
