LessWrong (Curated & Popular)

“A Three-Layer Model of LLM Psychology” by Jan_Kulveit

Jan 26, 2025
Jan Kulveit examines the psychology of character-trained LLMs such as Claude. He presents a three-layer model, consisting of the Surface Layer, the Character Layer, and the Predictive Ground Layer, and describes how the layers interact to shape model behavior. Kulveit discusses the implications of anthropomorphizing LLMs and argues for a nuanced understanding of their authenticity. He also addresses the model's limitations and the open questions that arise when interpreting interactions with language models.