
98 - Analyzing Information Flow In Transformers, With Elena Voita

NLP Highlights


The Importance of Attention in the Model

So we can prune like two thirds of all heads with all the specialized functions still being alive. So basically all functions are alive until we have the seven heads. And then if we push further, heads start taking on several functions, for example, rare tokens and syntactic functions. That's really interesting. Did you notice what the heads that actually survive the pruning process are performing?
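The pruning Voita describes works by attaching a gate to each attention head and driving most gates to zero during training (the paper uses a stochastic relaxation of an L0 penalty). A minimal NumPy sketch of the gating idea, assuming illustrative names and shapes (this is not the paper's code): each head's output is scaled by its gate, so a gate of 0 removes that head's contribution entirely.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gated_multi_head_attention(x, Wq, Wk, Wv, Wo, gates):
    """x: (seq, d_model); Wq/Wk/Wv: (n_heads, d_model, d_head);
    Wo: (n_heads * d_head, d_model); gates: (n_heads,) in [0, 1]."""
    head_outputs = []
    for h in range(Wq.shape[0]):
        q, k, v = x @ Wq[h], x @ Wk[h], x @ Wv[h]
        attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
        # A gate of 0 prunes the head: its output no longer reaches Wo.
        head_outputs.append(gates[h] * (attn @ v))
    return np.concatenate(head_outputs, axis=-1) @ Wo

rng = np.random.default_rng(0)
seq, d_model, n_heads, d_head = 4, 8, 4, 2
x = rng.normal(size=(seq, d_model))
Wq, Wk, Wv = (rng.normal(size=(n_heads, d_model, d_head)) for _ in range(3))
Wo = rng.normal(size=(n_heads * d_head, d_model))

full = gated_multi_head_attention(x, Wq, Wk, Wv, Wo, gates=np.ones(n_heads))
# "Pruning" two of four heads: their gates go to zero.
pruned = gated_multi_head_attention(x, Wq, Wk, Wv, Wo,
                                    gates=np.array([1.0, 0.0, 1.0, 0.0]))
```

In the paper the gates are learned jointly with the translation objective, so training itself decides which heads survive; here they are set by hand just to show the mechanism.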

