130. Edouard Harris - New Research: Advanced AI may tend to seek power *by default*

Towards Data Science

CHAPTER

How Do You Know if Your AI Will Consistently Do This?

The idea is, roughly speaking: I take my AI and put it in, say, a game landscape. Then I tell it, okay, now I'm going to reward you for, I don't know, getting to this square in the game. And are the things it seems to consistently like to do the things that this power argument would predict? Is that fair to say?

That's exactly right.
