Analyzing Language Models in Inferring Author Characteristics

This chapter examines how well language models, particularly LLMs, can infer author characteristics from text. It discusses GPT-3.5's accuracy at determining gender, education level, and ethnicity from text written by OkCupid users, which raises concerns about privacy and potential misalignment. The chapter then turns to fine-tuning language models for chat interactions and stresses that understanding model inferences matters for resisting manipulation.
