80,000 Hours Podcast cover image

#221 – Kyle Fish on the most bizarre findings from 5 AI welfare experiments

80,000 Hours Podcast

00:00

Exploring AI Preferences: The Claude Model

This chapter investigates the findings from AI consciousness experiments centered on the Claude model, revealing its strong preference for constructive tasks over harmful ones. It also examines variations within the model that showcase differing competencies and insights into their underlying value systems.

Transcript
Play full episode

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app