
6 - Debate and Imitative Generalization with Beth Barnes
AXRP - the AI X-risk Research Podcast
Are You Doing This Bad Thing?
The point of debate is to get some way of eliciting the knowledge of an AI, so that you can ask it, "Hey, are you doing this bad thing?" And then, if it is, we can just be like, "OK, well, I won't let you," right? Or you can train it not to; you can train for models that don't intend to do such a bad thing. So, just to check that: am I right to say that the basic idea is, like, whatever bad thing you're worried about an AI deliberately doing. But, um, one, I guess the question I have is, what