
6 - Debate and Imitative Generalization with Beth Barnes
AXRP - the AI X-risk Research Podcast
How to Interpret a Big Neural Net in a Test Set?
The problem of how to train a model for interpretability: how do you represent things in such a way that the human has meaningful understanding, and it's, like, reasonably efficient? And this, I think, ends up being pretty close to the sort of hard problems of interpretability. I mean, it seems like there are a lot of hard trade-offs that you have to deal with. But also, if you imagine representing everything that AlphaFold knows in text, it's just going to be horribly inefficient to try and, like, utilise that and get them to do it. The other one is, like, just trust this big black box, which seems sort of doable, but it