

Studying Machine Intelligence with Been Kim - #571
May 9, 2022
Been Kim, a staff research scientist at Google Brain and ICLR 2022 speaker, dives into the fascinating world of AI interpretability. She discusses the current state of interpretability techniques, exploring how Gestalt principles can enhance our understanding of neural networks. Been proposes a novel language for human-AI communication, aimed at improving collaboration and transparency. The conversation also touches on the evolution of AI tools, the unique insights from AlphaZero in chess, and the implications of model fingerprints for data privacy.
AI Snips
Sanity Check Paper
- Been Kim's 2018 paper, "Sanity Checks for Saliency Maps," showed that saliency explanations computed from a trained model and from one with randomized weights are often indistinguishable.
- This highlighted the need for more rigorous validation of interpretability methods; a minimal sketch of the randomization test follows below.
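
The test behind that finding is easy to reproduce in miniature. The sketch below assumes a hypothetical PyTorch classifier `model` and input batch `x` (neither is from the paper or the episode) and compares plain input-gradient saliency before and after randomizing the model's weights; a high correlation suggests the explanation barely depends on what the model actually learned.

```python
# Minimal sketch of a model-parameter randomization sanity check for
# saliency maps, in the spirit of the 2018 paper. `model` and `x` are
# hypothetical placeholders, not code from the paper or episode.
import copy
import torch

def gradient_saliency(model, x, target_class):
    """Plain input-gradient saliency for one target class."""
    x = x.detach().clone().requires_grad_(True)
    model(x)[:, target_class].sum().backward()
    return x.grad.detach().abs()

def randomization_sanity_check(model, x, target_class):
    """Correlate saliency from the trained model with saliency from a
    weight-randomized copy; a high value means the explanation is largely
    insensitive to what the model learned."""
    saliency_trained = gradient_saliency(model, x, target_class)

    randomized = copy.deepcopy(model)
    for p in randomized.parameters():        # throw away the learned weights
        torch.nn.init.normal_(p, std=0.02)
    saliency_random = gradient_saliency(randomized, x, target_class)

    stacked = torch.stack([saliency_trained.flatten(),
                           saliency_random.flatten()])
    return torch.corrcoef(stacked)[0, 1].item()
```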
Interpretability Methods' Reliability
- Interpretability methods are not always reliable, even those deployed in practice.
- Rigorous validation and human experiments are crucial for evaluating their effectiveness.
Choosing Interpretability Methods
- Choose interpretability methods based on the specific task.
- LIME's simplicity is beneficial for some tasks, but its limitations become apparent with complex decision boundaries (see the sketch below).
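
To see why, here is a toy, from-scratch version of LIME's core idea rather than the actual `lime` library: fit a proximity-weighted linear surrogate to a black-box model around one instance. The names `black_box` and `instance` are illustrative assumptions. When the local decision boundary is highly non-linear, the surrogate fits poorly and its coefficients become a misleading explanation.

```python
# Toy sketch of the idea behind LIME: a weighted linear surrogate fit
# around one instance. `black_box` is assumed to return a probability
# for one class per row; all names here are illustrative.
import numpy as np
from sklearn.linear_model import Ridge

def local_linear_explanation(black_box, instance, n_samples=1000,
                             kernel_width=0.75, seed=0):
    rng = np.random.default_rng(seed)
    # Perturb the instance with Gaussian noise to probe its neighborhood.
    perturbed = instance + rng.normal(scale=0.5,
                                      size=(n_samples, instance.size))
    preds = black_box(perturbed)

    # Weight samples by proximity to the original instance (RBF kernel).
    dists = np.linalg.norm(perturbed - instance, axis=1)
    weights = np.exp(-(dists ** 2) / (kernel_width ** 2))

    # The surrogate's coefficients are the "explanation"; a low weighted
    # fit score signals that a linear story misrepresents the local boundary.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(perturbed, preds, sample_weight=weights)
    fit_quality = surrogate.score(perturbed, preds, sample_weight=weights)
    return surrogate.coef_, fit_quality
```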