
How Do AIs' Political Opinions Change As They Get Smarter And Better-Trained?

Astral Codex Ten Podcast


Language Models' Power-Seeking Tendencies Increase With Parameters

RLHF'd AIs are more likely to want enhanced capabilities, strong impacts on the world, more power, and less human oversight. They would like to "persuade" humans to share their ethos of being helpful, harmless, and honest, which sounds good as long as you don't think about it too hard. The authors, who include MIRI researchers, point to Steve Omohundro's classic 2008 paper (link in post), arguing that AIs told to pursue any goal could become more power-seeking, since having power is a good way to achieve goals.

