
107 - Multi-Modal Transformers, with Hao Tan and Mohit Bansal

NLP Highlights


What Is the Language Embedding?

An image is naturally a two-dimensional array, where you have height and width. As for the language embedding, it is just a sequence of words with their position embeddings. The tool we use here is an object detector; the object detector tries to detect some meaningful objects in the image. It's just some rectangles on the image which contain some meaningful objects, labels, or something like this. Then we just use these objects as the input features. And the position embeddings, just as in language. So this is the general idea of observation.
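The parallel described above can be sketched in a few lines of Python. This is only an illustration, not the speakers' implementation: the function names, the tiny `HIDDEN` size, the box format `(x1, y1, x2, y2)`, and the random placeholder values standing in for real object-detector output are all assumptions. Each word is embedded as its word vector plus a position vector, and each detected object as a projection of its region feature plus a projection of its box coordinates.

```python
import random

HIDDEN = 4  # hypothetical embedding size, kept tiny for readability


def rand_matrix(rows, cols, rng):
    """Random rows x cols matrix, standing in for learned parameters."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def add(u, v):
    """Element-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(u, v)]


def project(vec, weights):
    """Multiply a length-n vector by an n x d weight matrix."""
    return [sum(x * row[j] for x, row in zip(vec, weights))
            for j in range(len(weights[0]))]


def embed_words(token_ids, word_table, pos_table):
    """Language side: word embedding plus the embedding of its sequence index."""
    return [add(word_table[t], pos_table[i]) for i, t in enumerate(token_ids)]


def embed_objects(region_feats, boxes, w_feat, w_box):
    """Image side: projected region feature plus projected box coordinates."""
    return [add(project(f, w_feat), project(b, w_box))
            for f, b in zip(region_feats, boxes)]


rng = random.Random(0)

# Language input: three token ids looked up in a toy vocabulary of 10 words.
word_table = rand_matrix(10, HIDDEN, rng)
pos_table = rand_matrix(8, HIDDEN, rng)
lang = embed_words([2, 5, 7], word_table, pos_table)

# Image input: two detected objects, each with a 6-d region feature and a
# box (x1, y1, x2, y2). Real values would come from an object detector.
region_feats = rand_matrix(2, 6, rng)
boxes = [[0.1, 0.2, 0.5, 0.6], [0.4, 0.4, 0.9, 0.8]]
vis = embed_objects(region_feats, boxes,
                    rand_matrix(6, HIDDEN, rng), rand_matrix(4, HIDDEN, rng))

print(len(lang), len(lang[0]))  # one vector per word
print(len(vis), len(vis[0]))    # one vector per detected object
```

Both sides end up as sequences of vectors of the same size, which is what lets a transformer treat detected objects much like words.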
