
98 - Analyzing Information Flow In Transformers, With Elena Voita
NLP Highlights
The Most Relevant Heads Are the Most Confident Heads
There was also some notion of head confidence I saw in the paper. Can you talk about that? Yeah, relevance is nice in a way, but it's really hard to [unclear]. So we evaluate confidence by picking the maximum attention weight for each token and then taking an average. Intuitively, confident heads are the ones which tend to put all the attention mass on a single token. But on the first layer we don't actually see highly confident heads. For all models we looked at, there was one head which was much more important than other heads. And when we looked at what it was doing, we saw that it is an attention head, which
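The confidence measure described in this snippet (take the largest attention weight for each token, then average) can be sketched in a few lines. This is only an illustration under stated assumptions, not the code used in the paper: the function name `head_confidence`, the row-per-query-token layout, and the two example matrices are all made up here.

```python
from typing import Sequence


def head_confidence(attention: Sequence[Sequence[float]]) -> float:
    """Average, over query tokens, of the largest attention weight in each row.

    Assumption: attention[i][j] is the weight that query token i places on
    token j for a single head, and each row sums to 1.
    """
    if not attention:
        raise ValueError("attention matrix must contain at least one row")
    return sum(max(row) for row in attention) / len(attention)


# Hypothetical attention maps for two heads over a 3-token sentence.
focused_head = [
    [0.98, 0.01, 0.01],
    [0.02, 0.96, 0.02],
    [0.01, 0.01, 0.98],
]
diffuse_head = [
    [0.4, 0.3, 0.3],
    [0.3, 0.4, 0.3],
    [0.3, 0.3, 0.4],
]

# The focused head puts nearly all its mass on one token per row,
# so it scores higher than the diffuse head.
print(head_confidence(focused_head))
print(head_confidence(diffuse_head))
```

A head that spreads its attention evenly scores close to 1 divided by the sentence length, while a head that always locks onto a single token scores close to 1, which matches the intuition given in the snippet.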