
6 - Debate and Imitative Generalization with Beth Barnes
AXRP - the AI X-risk Research Podcast
The Importance of More Interpretability
Beth Barnes: I'm really excited about people just doing a bunch more interpretability. There's some kind of window for a treacherous turn, where there's a period between when a model is smart enough to think about deception and when it's good enough to get away with it. Even if we can just do very crude things, like, you know, is this model just thinking about some category of stuff it really shouldn't be thinking about or using without an agent? And I think even if you think that's just way too demanding, and we're never going to be able to do anything like that.