#221 – Kyle Fish on the most bizarre findings from 5 AI welfare experiments

80,000 Hours Podcast

Exploring Introspection and Task Preference in AI Models

This chapter examines introspection in AI models, focusing on research into how well they can predict their own behavior. It also discusses task preference assessments, which gauge models' behavioral inclinations from the choices they make across diverse tasks.
