AXRP - the AI X-risk Research Podcast cover image

41 - Lee Sharkey on Attribution-based Parameter Decomposition

AXRP - the AI X-risk Research Podcast

00:00

Understanding Attribution-based Parameter Decomposition

This chapter dives deep into Attribution-based Parameter Decomposition (APD) through analogies of car components, illustrating how different elements like brakes and speedometers contribute contextually to system behavior. It emphasizes the importance of thorough input analysis and the nuanced role of neural network parameters across layers to avoid misinterpretations in complex systems. The chapter also addresses challenges in hyperparameter selection and the trade-offs between model complexity and performance, while connecting these ideas to principles of minimum description length.

Transcript
Play full episode

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app