How to Speed Up the Attention Mechanism in Transformers
This is also a very good piece of work on how to speed up the attention mechanism in transformers, which is one of the main computational bottlenecks because it scales quadratically with sequence length. What this work does is decompose the attention over a sequence into a local part and a global part, and combine the two into what they call the Composite Slice Transformer.
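As an illustration of the general local-plus-global idea (not the paper's exact formulation), here is a minimal NumPy sketch: each query attends exactly within a small local window, and only coarsely to pooled per-slice summaries for global context, so the score matrix shrinks from n x n to roughly n x (window + n/slice_len). The function name `local_global_attention` and the `window` and `slice_len` values are illustrative assumptions, not details from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_global_attention(q, k, v, window=16, slice_len=16):
    """q, k, v: (n, d) single-head arrays; returns (n, d) outputs.
    Hypothetical sketch of attention split into a local and a global part."""
    n, d = q.shape
    out = np.zeros_like(v)

    # Global path: pool keys/values within each slice and attend to the
    # pooled summaries, giving n x (n/slice_len) scores instead of n x n.
    n_slices = int(np.ceil(n / slice_len))
    k_glob = np.stack([k[i*slice_len:(i+1)*slice_len].mean(0) for i in range(n_slices)])
    v_glob = np.stack([v[i*slice_len:(i+1)*slice_len].mean(0) for i in range(n_slices)])

    for i in range(n):
        # Local path: exact attention inside a window around position i.
        lo, hi = max(0, i - window), min(n, i + window + 1)
        scores_loc = q[i] @ k[lo:hi].T / np.sqrt(d)
        scores_glb = q[i] @ k_glob.T / np.sqrt(d)
        # One softmax over both parts so local and global scores compete.
        weights = softmax(np.concatenate([scores_loc, scores_glb]))
        out[i] = weights[:hi - lo] @ v[lo:hi] + weights[hi - lo:] @ v_glob
    return out

# Tiny usage example on random data.
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(128, 32)) for _ in range(3))
print(local_global_attention(q, k, v).shape)  # (128, 32)
```

The point of the sketch is only the complexity argument: the exact (quadratic) computation is confined to short local windows, while long-range interactions go through a much smaller set of pooled summaries.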