Anthropic’s superposition paper claims that models are under-parameterized
The superposition hypothesis in Anthropic's paper suggests that models are under-parameterized relative to the data they face: when the data is high-dimensional and sparse, as internet-scale text is, the model adopts a compression strategy, packing more features of the world into its weights than it has parameters. Superposition arises precisely in this sparse, high-dimensional regime.

This is also why neural networks are hard to interpret: under superposition, individual neurons respond to mixtures of unrelated features, so each neuron's contribution to the output is entangled and confusing. The compression can be undone by projecting activations into a higher-dimensional space and applying a sparsity penalty, which recovers cleaner, more interpretable features. Contrary to the popular belief that deep learning models are over-parameterized, the paper's claim is that they are dramatically under-parameterized, given the complexity of the tasks they are designed to handle.
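As a rough illustration of that "project up and sparsify" idea, here is a minimal sparse-autoencoder sketch in PyTorch. The class name, the 8x expansion factor, and the `l1_coeff` value are illustrative assumptions, not the exact architecture or hyperparameters from Anthropic's work.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Projects activations into a higher-dimensional feature space.

    An L1 penalty on the hidden features encourages each input to
    activate only a few of them, "undoing" the superposed packing.
    """
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)  # d_hidden >> d_model
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, activations: torch.Tensor):
        features = torch.relu(self.encoder(activations))  # sparse features
        reconstruction = self.decoder(features)
        return reconstruction, features

def sae_loss(reconstruction, activations, features, l1_coeff=1e-3):
    # Reconstruction term keeps the decoded activations faithful;
    # the L1 term is the sparsity penalty described above.
    recon_loss = (reconstruction - activations).pow(2).mean()
    sparsity = features.abs().mean()
    return recon_loss + l1_coeff * sparsity

# Hypothetical usage on activations from some layer of width 512,
# expanded 8x into 4096 candidate features:
sae = SparseAutoencoder(d_model=512, d_hidden=4096)
acts = torch.randn(64, 512)
recon, feats = sae(acts)
loss = sae_loss(recon, acts, feats)
loss.backward()
```

The key design choice is the trade-off in the loss: the reconstruction term alone would let the autoencoder copy the compressed representation, while the sparsity term forces it to spread the information across many dimensions, each ideally corresponding to one interpretable feature.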