Neel Nanda on mechanistic interpretability, superposition and grokking

The Inside View

Generalization of Tiny Models on Algorithmic Tasks

This chapter discusses a research paper on tiny models trained on algorithmic tasks: the models first memorize the training data, then gradually generalize and learn to perform the task itself. The speaker explains the background of the paper and describes discovering a different family of tasks based on composing permutations. They then turn to representation theory, which studies groups by representing their elements as linear transformations of vector spaces.
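To make the idea of "understanding groups through linear transformations" concrete, here is a minimal sketch in plain Python. It is only an illustration, not the setup used in the paper or described by the speaker: it assumes the simplest possible representation, where each permutation is mapped to its permutation matrix, and checks that composing permutations corresponds to multiplying matrices. The names `compose`, `perm_matrix`, `matmul`, and `is_homomorphism` are hypothetical helpers introduced for this example.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation "p after q" (apply q first, then p).

    A permutation is a tuple where index i maps to p[i].
    """
    return tuple(p[q[i]] for i in range(len(q)))


def perm_matrix(p):
    """Return the matrix that sends basis vector e_i to e_{p[i]}."""
    n = len(p)
    return [[1 if p[col] == row else 0 for col in range(n)] for row in range(n)]


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def is_homomorphism(n):
    """Check that perm_matrix(p after q) == perm_matrix(p) @ perm_matrix(q)
    for every pair of permutations of n elements."""
    group = list(permutations(range(n)))
    return all(
        perm_matrix(compose(p, q)) == matmul(perm_matrix(p), perm_matrix(q))
        for p in group
        for q in group
    )
```

Calling `is_homomorphism(3)` checks all 36 pairs of permutations of three elements. The property it tests, that the group operation turns into matrix multiplication, is what makes a map from group elements to matrices a representation.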
