
42 - Owain Evans on LLM Psychology
AXRP - the AI X-risk Research Podcast
00:00
Exploring Model Introspection and Self-Awareness
This chapter examines the self-awareness and introspective abilities of large language models, as studied in the research paper 'Tell Me About Yourself.' The discussion covers how well models describe their own behaviors, why effect sizes matter when interpreting those results, and how hard it is to generalize findings from simple tasks to complex ones. Drawing on several experimental setups, it argues for deeper investigation into how models understand their own decision-making tendencies and how they articulate that understanding.