
6 - Debate and Imitative Generalization with Beth Barnes

AXRP - the AI X-risk Research Podcast

CHAPTER

Are You Doing This Bad Thing?

The point of debate is to get some way of eliciting the knowledge of an AI, so that you can ask it, hey, are you doing this bad thing? Um. And then, like, if it is, we can just be like, OK, well, I won't let you, right? Or you can train it not to; you can train for models that don't intend to do such a bad thing. So just to check that: am I right to say that the basic idea is, like, whatever bad thing you're worried about an AI deliberately doing. But, um, one, I guess the question I have is, what

