6 - Debate and Imitative Generalization with Beth Barnes

AXRP - the AI X-risk Research Podcast

00:00

Are You Doing This Bad Thing?

The point of debate is to get some way of eliciting the knowledge of an AI, so that you can ask it, hey, are you doing this bad thing? And then, if it is, we can just be like, OK, well, I won't let you, right? Or you can train it not to; you can train for models that don't intend to do such a bad thing. So just to check that, am I right to say that the basic idea is, whatever bad thing you're worried about an AI deliberately doing. But one, I guess the question I have is, what
