AI-powered
podcast player
Listen to all your favourite podcasts with AI-powered features
Is There a Shift in Interpretability?
There was a shift from example-based concept-based interpretability that has occurred both in your own work and interpretability more broadly. And even as recently as I think 2016, you had this paper, examples are not enough learned to criticize where you were looking into this MMD critic method. Could you tell me a little bit about that shift in thinking? How that occurred for you?