
LessWrong (Curated & Popular) “SAE feature geometry is outside the superposition hypothesis” by jake_mendel
Jun 25, 2024
This episode explores the limitations of the superposition hypothesis in neural network activation spaces, focusing on feature geometry and why the specific locations of feature vectors matter. It argues that new theories are needed to explain the rich structure found in activation spaces, suggests studying toy models to build that understanding, and proposes concepts beyond superposition for describing model computation.
