
137 - Nearest Neighbor Language Modeling and Machine Translation, with Urvashi Khandelwal
NLP Highlights
kNN-LMs - How Do They Work?
The kNN-LM formulation separates the problem of representation learning (or text similarity) from next-word prediction. The keys in the datastore are context representations, which we get by doing forward passes with our language model over the data that needs to be memorized; the values are the words that followed those contexts. Because the same value appears multiple times in the datastore, we can end up with duplicate values in our retrieved set as well. In this way, we get a kNN distribution over the same vocabulary as the LM's output distribution, where any word from the vocabulary that doesn't appear in the retrieved set simply gets a probability of zero.
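To make the mechanics concrete, here is a minimal sketch in NumPy of the steps described above. The `encode` argument stands in for the language model's forward pass that produces a context representation; the distance-based softmax weighting and the interpolation weight `lam` follow the kNN-LM formulation, but the function names, `k`, and hyperparameter values are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def build_datastore(contexts, next_tokens, encode):
    """Keys are context representations from LM forward passes over the
    data to be memorized; values are the tokens that followed them."""
    keys = np.stack([encode(c) for c in contexts])  # shape (N, d)
    values = np.asarray(next_tokens)                # shape (N,)
    return keys, values

def knn_distribution(query, keys, values, vocab_size, k=4):
    """Retrieve the k nearest keys and turn their values into a
    distribution over the full vocabulary. Any word that never appears
    in the retrieved set keeps probability zero."""
    dists = np.sum((keys - query) ** 2, axis=1)   # squared L2 distance
    nearest = np.argsort(dists)[:k]
    weights = np.exp(-dists[nearest])             # softmax over negative distances
    weights /= weights.sum()
    p_knn = np.zeros(vocab_size)
    for idx, w in zip(nearest, weights):
        p_knn[values[idx]] += w  # duplicate values pool their weights
    return p_knn

def knn_lm(p_lm, p_knn, lam=0.25):
    """Final next-word distribution: interpolate the parametric LM
    distribution with the retrieved kNN distribution."""
    return lam * p_knn + (1 - lam) * p_lm
```

In practice the datastore can hold billions of keys, so the exact brute-force search above is replaced by an approximate nearest-neighbor index (the kNN-LM work uses FAISS), but the distribution computation is the same.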