The Nonlinear Library

LW - Gated Attention Blocks: Preliminary Progress toward Removing Attention Head Superposition by cmathw

Apr 9, 2024