Neel Nanda on mechanistic interpretability, superposition and grokking

The Inside View

Generalization of Tiny Models on Algorithmic Tasks

This chapter discusses a research paper on grokking: tiny models trained on algorithmic tasks first memorize the training data, then gradually generalize to actually performing the task. The speaker explains the background of the paper and the discovery of a different family of tasks based on composing permutations. The discussion then turns to representation theory, which studies groups by representing their elements as linear transformations of vector spaces.
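
To make the link between permutation composition and representation theory concrete, here is a minimal sketch in Python. It is an illustration only and is not taken from the paper or the episode: the group S3, the helper names `compose` and `perm_matrix`, and the choice of permutation matrices as the representation are all assumptions picked for brevity. It checks that mapping each permutation to a matrix turns composition into matrix multiplication, which is the defining property of a representation.

```python
import itertools

import numpy as np


def compose(p, q):
    """Return the permutation 'apply q first, then p'.

    Permutations are tuples where index i maps to p[i].
    (Hypothetical helper, named for this illustration.)
    """
    return tuple(p[q[i]] for i in range(len(q)))


def perm_matrix(p):
    """Return the matrix M that sends basis vector e_i to e_{p[i]}.

    (Hypothetical helper, named for this illustration.)
    """
    n = len(p)
    m = np.zeros((n, n), dtype=int)
    for i, target in enumerate(p):
        m[target, i] = 1  # column i has its single 1 in row p[i]
    return m


# All 6 elements of S3, chosen here only because it is small.
S3 = list(itertools.permutations(range(3)))

# A representation must satisfy rho(p . q) == rho(p) @ rho(q) for every pair.
is_homomorphism = all(
    np.array_equal(perm_matrix(compose(p, q)), perm_matrix(p) @ perm_matrix(q))
    for p in S3
    for q in S3
)
```

Permutation matrices are only one possible representation of the group; they are used here because they are the easiest to write down and check by hand.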
