3min snip

Sholto Douglas & Trenton Bricken - How to Build & Understand GPT-7's Mind

Dwarkesh Podcast

NOTE

Anthropic’s superposition paper claims that models are under-parameterized

The superposition hypothesis in Anthropic's paper holds that models are under-parameterized relative to the data they model: internet-scale data is high-dimensional and sparse, so the model adopts a compression strategy, packing more features of the world into its weights than it has dimensions to represent them cleanly. Superposition arises in exactly these sparse, high-dimensional regimes. This is also why neural networks are hard to interpret: individual neurons respond to many unrelated features at once, so their contributions to the model's output look confusingly mixed. By projecting activations into a higher-dimensional space and applying a sparsity penalty, the compression can be undone, yielding cleaner, more interpretable features. Contrary to the popular view that these models are over-parameterized, the paper's claim is that deep learning models are dramatically under-parameterized given the complexity of the tasks they are designed to handle.
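
A minimal sketch of that "project into a higher-dimensional space and apply a sparsity penalty" step, i.e. a sparse autoencoder trained on a layer's activations. The class name, dimensions, and l1_coeff value below are illustrative, not Anthropic's actual settings:

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Projects model activations into a wider space where a sparsity
    penalty encourages each dimension to capture one clean feature."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        # d_hidden >> d_model: more dictionary features than neurons,
        # undoing the compression that superposition performs.
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x: torch.Tensor):
        f = torch.relu(self.encoder(x))  # sparse feature activations
        x_hat = self.decoder(f)          # reconstruct original activations
        return x_hat, f

def sae_loss(x, x_hat, f, l1_coeff: float = 1e-3):
    # Reconstruction error keeps the features faithful to the model;
    # the L1 term keeps only a few features active per input.
    return (x - x_hat).pow(2).mean() + l1_coeff * f.abs().mean()

# Illustrative usage: stand-in activations from one layer of a transformer.
acts = torch.randn(4096, 512)                       # (n_samples, d_model)
sae = SparseAutoencoder(d_model=512, d_hidden=4096)  # 8x expansion
x_hat, f = sae(acts)
loss = sae_loss(acts, x_hat, f)
```

The expansion ratio (d_hidden / d_model) and the L1 coefficient are the key knobs: the wider and sparser the dictionary, the more of the packed-together features can be pulled apart into individually interpretable directions.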
