
98 - Analyzing Information Flow In Transformers, With Elena Voita
NLP Highlights
00:00
How to Prune a Model?
In the original model, the outputs of the heads in multi-head attention are concatenated. And for each of the models, the most important head on the first layers is [unclear]. So intuitively, why do we want to minimize the probability that it's open? If we don't, the model doesn't have to remove any heads; really, it wants to do nothing. If I don't explicitly tell the model that we want to remove some heads, it doesn't have to do it. With this we can push it, more or less, to switch off the less important heads.
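The idea in this clip can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from the episode: each head's output is scaled by a scalar gate before concatenation, and a penalty proportional to the summed probability of the gates being open is what pushes the model to switch heads off. The names `gated_concat`, `open_gate_penalty`, and `lam` are hypothetical, and a plain sigmoid stands in for whatever gate distribution is actually used.

```python
import math


def sigmoid(x):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_concat(head_outputs, gates):
    """Scale each head's output vector by its scalar gate, then concatenate.

    A gate of 1.0 leaves the head untouched (the original model);
    a gate of 0.0 removes the head's contribution entirely.
    """
    if len(head_outputs) != len(gates):
        raise ValueError("need exactly one gate per head")
    out = []
    for vec, gate in zip(head_outputs, gates):
        out.extend(gate * v for v in vec)
    return out


def open_gate_penalty(gate_logits, lam=0.1):
    """Assumed regularizer: lam times the summed probability of open gates.

    Without a term like this, nothing pushes the model to close any gate.
    """
    return lam * sum(sigmoid(z) for z in gate_logits)


# Three toy heads with two-dimensional outputs (made-up numbers).
heads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Closing the second gate zeroes out that head in the concatenation.
pruned = gated_concat(heads, [1.0, 0.0, 1.0])

# The penalty is lower when one gate is likely to be closed.
penalty_all_open = open_gate_penalty([5.0, 5.0, 5.0])
penalty_one_closed = open_gate_penalty([5.0, -5.0, 5.0])
```

In training, a penalty of this kind would be added to the task loss, so a head stays switched on only if it helps the task more than it costs in the penalty.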