
"Against Almost Every Theory of Impact of Interpretability" by Charbel-Raphaël
LessWrong (Curated & Popular)
Interpreting the Limits of Interpretability
Exploring the challenges of auditing deception with interpretability and questioning its value compared to other technical work. Critiquing interpretability techniques such as Grad-CAM and pixel attribution, while discussing their limitations and effectiveness in industry applications. Offering alternative perspectives on predicting future systems, beyond the conventional theories of impact for interpretability.
Transcript


