41 - Lee Sharkey on Attribution-based Parameter Decomposition

AXRP - the AI X-risk Research Podcast

00:00

Intro

This chapter introduces a research paper on interpretability in parameter space, which aims to minimize mechanistic description length through parameter decomposition. The authors discuss challenges with sparse autoencoders and propose an approach that decomposes a network's parameters directly to improve understanding of neural networks.
