NLP Highlights

107 - Multi-Modal Transformers, with Hao Tan and Mohit Bansal

Feb 24, 2020
Chapters
1
Introduction
00:00 • 2min
2
Using Pre-Trained Transformers to Do Long Tail Image Processing
02:12 • 2min
3
The Multi-Modal Representation Is a Good Example
04:00 • 2min
4
Using Machine Translations to Resolve Ambiguities
05:42 • 2min
5
What Is the Language Embedding?
07:38 • 2min
6
Cross-Attention Blocks in Vision-Language Models?
09:15 • 2min
7
The Self-Attention Layer of the Transformer
11:10 • 2min
8
How to Pretrain Your Model?
13:12 • 1min
9
The Intuition Behind Object Recognition Pretraining Tasks
14:41 • 1min
10
Do You Have a Feature Regression Task?
16:11 • 2min
11
Using Image Captioning and VQA Data Sets for Pretraining
17:47 • 2min
12
Is There Room for a Latent Alignment Model?
19:46 • 2min
13
Do You Have a Problem With Multiple Captions?
21:42 • 2min
14
Do You Have Any Overlapping Tasks in Your Model Training?
23:32 • 2min
15
A Question About High-Level Trends in the Paper
25:28 • 2min
16
Can You Give a Quick Summary of Your Results?
27:11 • 2min
17
Is It Possible to Pretrain a BERT-Like Model?
29:03 • 2min
18
The Differences Between LXMERT and Other Multi-Modal Transformer Papers
30:47 • 2min
19
The Best Way to Train a Vision-Plus-Language Encoder?
32:22 • 2min
20
How Children Learn Language
34:07 • 2min
21
Cross-Modal Alignments
35:44 • 2min