Chapters 
 Introduction 
 00:00 • 2min 
 Using Pre-Trained Transformers to Do Long Tail Image Processing 
 02:12 • 2min 
 The Multi-Modal Representation Is a Good Example 
 04:00 • 2min 
 Using Machine Translations to Resolve Ambiguities 
 05:42 • 2min 
 What Is the Language Embedding? 
 07:38 • 2min 
 Cross-Attention Blocks in Vision-Language Models 
 09:15 • 2min 
 The Self-Attention Layer of the Transformer 
 11:10 • 2min 
 How to Pretrain Your Model? 
 13:12 • 1min 
 The Intuition Behind Object Recognition Pretraining Tasks 
 14:41 • 1min 
 Do You Have a Feature Regression Task? 
 16:11 • 2min 
 Using Image Captioning and VQA Datasets for Pretraining 
 17:47 • 2min 
 Is There Room for a Latent Alignment Model? 
 19:46 • 2min 
 Do You Have a Problem With Multiple Captions? 
 21:42 • 2min 
 Do You Have Any Overlapping Tasks in Your Model Training? 
 23:32 • 2min 
 A Question About High-Level Trends in the Paper 
 25:28 • 2min 
 Can You Give a Quick Summary of Your Results? 
 27:11 • 2min 
 Is It Possible to Pretrain a BERT-Like Model? 
 29:03 • 2min 
 The Differences Between LXMERT and Other Multimodal Transformer Papers 
 30:47 • 2min 
 The Best Way to Train a Vision-Plus-Language Encoder? 
 32:22 • 2min 
 Teaching Children to Learn Language 
 34:07 • 2min 
 Cross-Modal Alignments 
 35:44 • 2min 

