
41 - Lee Sharkey on Attribution-based Parameter Decomposition
AXRP - the AI X-risk Research Podcast
00:00
Understanding Attribution-based Parameter Decomposition
This chapter dives deep into Attribution-based Parameter Decomposition (APD) through analogies of car components, illustrating how different elements like brakes and speedometers contribute contextually to system behavior. It emphasizes the importance of thorough input analysis and the nuanced role of neural network parameters across layers to avoid misinterpretations in complex systems. The chapter also addresses challenges in hyperparameter selection and the trade-offs between model complexity and performance, while connecting these ideas to principles of minimum description length.
Transcript
Play full episode