
107 - Multi-Modal Transformers, with Hao Tan and Mohit Bansal

NLP Highlights


The Self-Attention Layer of the Transformer

We use the same architecture here. We first have a cross-attention layer that attends to the other modality, and then we have a self-attention layer. In that model, you would first have a cross-attention layer, and then you would have a modeling layer to process the fused information better. So the self-attention is in place of this modeling. But we could consider it as two transformer decoders in parallel.
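A minimal PyTorch sketch of the layer described above: each modality stream first cross-attends to the other modality, then a self-attention layer takes the place of the decoder's extra modeling step, with the two streams running in parallel. The class, parameter, and tensor names here are hypothetical and the feed-forward sublayers and dropout of a full model are omitted; this is not the authors' released implementation.

```python
import torch
import torch.nn as nn


class CrossModalityLayer(nn.Module):
    """Cross-attention to the other modality, then self-attention,
    applied to two streams in parallel (a sketch of the idea in the
    quote above, not the authors' code)."""

    def __init__(self, d_model: int = 768, n_heads: int = 12):
        super().__init__()
        # One cross-attention and one self-attention block per modality.
        self.cross_lang = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_vis = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.self_lang = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.self_vis = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norms = nn.ModuleList(nn.LayerNorm(d_model) for _ in range(4))

    def forward(self, lang: torch.Tensor, vis: torch.Tensor):
        # Cross-attention: each stream queries the *other* modality.
        lang_x, _ = self.cross_lang(lang, vis, vis)
        vis_x, _ = self.cross_vis(vis, lang, lang)
        lang = self.norms[0](lang + lang_x)
        vis = self.norms[1](vis + vis_x)

        # Self-attention over the fused representations,
        # in place of the decoder's extra modeling layer.
        lang_s, _ = self.self_lang(lang, lang, lang)
        vis_s, _ = self.self_vis(vis, vis, vis)
        lang = self.norms[2](lang + lang_s)
        vis = self.norms[3](vis + vis_s)
        return lang, vis


if __name__ == "__main__":
    layer = CrossModalityLayer()
    lang_feats = torch.randn(2, 20, 768)  # e.g. token features
    vis_feats = torch.randn(2, 36, 768)   # e.g. image-region features
    lang_out, vis_out = layer(lang_feats, vis_feats)
    print(lang_out.shape, vis_out.shape)
```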

