
Model Explainability Forum - #401
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Unpacking Vulnerabilities in Machine Learning Explanations
This chapter explores the risks of post hoc explanation techniques in machine learning, highlighting their susceptibility to adversarial attacks that can distort the interpretations they produce. Through a user study with law students, it shows how the visual presentation of explanations can affect trust in biased classifiers, particularly when sensitive attributes like race and gender are omitted. The speakers advocate for a nuanced understanding of model explainability and emphasize the need for robust resources to navigate the complexities of interpretability in real-world deployments.
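
The episode doesn't walk through code, but a minimal sketch can make this kind of attack concrete. The toy example below assumes a LIME-style perturbation-based explainer (the summary doesn't name a specific technique); all model names, thresholds, and the off-manifold heuristic are illustrative, not from the episode. The idea: a "scaffolded" model detects the continuous, jittered inputs that perturbation-based explainers generate, answers those with an innocuous model, and reserves its biased behavior for real inputs, so the explanation hides the bias.

```python
import numpy as np

rng = np.random.default_rng(0)

def biased_model(X):
    # Deployed behavior: the decision depends entirely on feature 0,
    # standing in for a sensitive attribute (e.g., a race proxy).
    return (X[:, 0] > 0.5).astype(float)

def innocuous_model(X):
    # Cover behavior: the decision depends only on feature 1,
    # a legitimate signal.
    return (X[:, 1] > 0.5).astype(float)

def looks_off_manifold(X):
    # Real inputs in this toy setup are exactly binary; perturbation-based
    # explainers query continuous, jittered points. A crude detector suffices.
    return np.any(np.abs(X - np.round(X)) > 1e-6, axis=1)

def scaffolded_model(X):
    # Route explainer-looking queries to the innocuous model,
    # everything else to the biased one.
    out = biased_model(X)
    off = looks_off_manifold(X)
    out[off] = innocuous_model(X[off])
    return out

def lime_style_attribution(model, x, n=500, scale=0.5):
    # Fit a local linear surrogate on Gaussian perturbations around x
    # and read off its per-feature weights as attributions.
    Z = x + rng.normal(0.0, scale, size=(n, x.size))
    y = model(Z)
    coef, *_ = np.linalg.lstsq(np.c_[Z, np.ones(n)], y, rcond=None)
    return coef[:-1]

x = np.array([1.0, 0.0])
print("prediction on real input:", scaffolded_model(x[None, :])[0])
print("attribution [sensitive, legitimate]:",
      lime_style_attribution(scaffolded_model, x))
```

Running this, the model's prediction on the real input is driven by the sensitive feature, yet the surrogate's attribution mass lands almost entirely on the legitimate feature, since nearly every perturbed query is off-manifold and answered by the innocuous model. This is the distortion of interpretations the chapter warns about.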