42 - Owain Evans on LLM Psychology

AXRP - the AI X-risk Research Podcast

Exploring Model Introspection and Self-Awareness

This chapter examines the introspective abilities and self-awareness of large language models, as presented in the research paper 'Tell Me About Yourself.' The discussion covers how well the models describe their own behaviors, the significance of the effect sizes, and the difficulty of generalizing findings from simple tasks to complex ones. Drawing on several experimental setups, it stresses the need for deeper investigation into how well the models understand their own decision-making tendencies and how they articulate that understanding.
