Analyzing the Structure of Feature Vectors and Searching for New Theories

Looking beyond the superposition hypothesis, this chapter examines UMAP projections of feature vectors, which reveal rich structure, and discusses the evidence from feature splitting. It proposes three classes of theories and outlines ways to pursue them, such as analyzing large sparse autoencoders (SAEs) and LLM representations, with the aim of better understanding activation spaces.
