
Neel Nanda - Mechanistic Interpretability

Machine Learning Street Talk (MLST)

CHAPTER

Exploring Superposition in Neural Networks

This chapter investigates superposition in neural networks: how models can encode more features than they have neurons. It examines the challenges of extracting meaningful features and of understanding feature interactions in high-dimensional spaces, and discusses empirical research on what superposition implies for mechanistic interpretability in models such as GPT-2.
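To make the core idea concrete, here is a minimal illustrative sketch (not code from the episode): if each feature is assigned a random unit direction in a lower-dimensional activation space, nearly orthogonal directions let many more features than dimensions be superposed and still read out with little interference. The dimensions, feature count, and threshold below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 256, 1024  # 256 "neurons" hold directions for 1024 features

# Assign each feature a random unit direction in the d-dim activation space.
features = rng.standard_normal((n, d))
features /= np.linalg.norm(features, axis=1, keepdims=True)

# Superpose a sparse set of active features into a single activation vector.
active = [3, 77, 900]
activation = features[active].sum(axis=0)

# Read out every feature by projecting onto its direction. Active features
# score near 1; inactive ones score near 0, because random directions in
# high dimensions are nearly orthogonal.
scores = features @ activation
recovered = np.where(scores > 0.5)[0]  # features that clear the threshold
print(sorted(recovered.tolist()))
```

This only works cleanly when few features are active at once; as more features fire simultaneously, the interference terms accumulate, which is one reason extracting features from superposed representations is hard.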
