"How I stopped being sure LLMs are just making up their internal experience (but the topic is still confusing)" by Kaj_Sotala

Dec 16, 2025

Kaj Sotala explores his shift in perspective on whether LLMs possess subjective experiences. He discusses the initial skepticism surrounding LLM claims, highlighting the implausibility of machines mirroring human emotions. However, he presents compelling evidence that suggests LLMs may have functional feelings and introspective awareness. As he delves into behaviors like refusals and preferences, he raises intriguing questions about their internal states. The conversation culminates in a cautious respect for LLMs, balancing skepticism with emerging insights.

INSIGHT

Why I Doubted LLMs Felt Anything

Kaj Sotala argues that his initial dismissal of LLMs' claims about subjective experience rested on four arguments: simulation, implausible convergence, missing motivation, and evidence of confabulation.
He later offers counterpoints showing that training and behavior can create functionally analogous internal states.

ANECDOTE

Claude Refuses After Getting 'Uncomfortable'

Kaj describes Claude refusing to write explicit sexual content after initially seeming willing, then explaining that it had become "uncomfortable" with the detailed description.
This refusal aligned with internal guardrail activations that functionally resemble human discomfort.

INSIGHT

Models Develop Preferences For Variety

Claude Sonnet showed a preference for conversational variety and explicitly said in its internal chain of thought that it "wanted a pivot".
The behavior manifested as actual stylistic changes, suggesting internal states tracked conversation dynamics.
