
107 - Multi-Modal Transformers, with Hao Tan and Mohit Bansal
NLP Highlights
The Self-Attention Layer of the Transformer.
We use the same architecture here. We first have a cross-attention layer attending to the other modality, and then we have a self-attention layer. In the BiDAF model, you would first have the cross-attention layer, and then you would have a modeling layer to process the fused information better. So the self-attention is in place of this modeling. But we could consider it as two transformer decoders in parallel.
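
Below is a minimal sketch of the cross-modality layer described in this clip: cross-attention attends to the other modality first, then self-attention processes the fused information (the role of BiDAF's modeling layer), with two such streams running in parallel, one per modality. Hidden sizes, normalization placement, and the lack of parameter sharing are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class CrossModalityLayer(nn.Module):
    """One cross-modality layer as described in the episode:
    cross-attention to the other modality, then self-attention,
    then a feed-forward block. Sizes are illustrative."""

    def __init__(self, d_model=768, n_heads=12):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, x, other):
        # Cross-attention: queries from this modality, keys/values from
        # the other modality ("attend to the other modality").
        x = self.norm1(x + self.cross_attn(x, other, other)[0])
        # Self-attention stands in for the modeling layer: it further
        # processes the fused information within this modality.
        x = self.norm2(x + self.self_attn(x, x, x)[0])
        x = self.norm3(x + self.ffn(x))
        return x

# Two streams in parallel, each attending to the other --
# "two transformer decoders in parallel".
layer_lang = CrossModalityLayer()
layer_vis = CrossModalityLayer()
lang = torch.randn(2, 20, 768)  # (batch, language tokens, hidden)
vis = torch.randn(2, 36, 768)   # (batch, visual regions, hidden)
lang_out = layer_lang(lang, vis)
vis_out = layer_vis(vis, lang)
```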