
137 - Nearest Neighbor Language Modeling and Machine Translation, with Urvashi Khandelwal
NLP Highlights
Interpolation Hyperparameter
Do you think it would make sense to make the interpolation hyperparameter more context dependent? I know you have it static, decided once and kept fixed throughout the model. And I think there's some work from CMU on efficient nearest neighbor language models where they did try to use a small network on top of the LM to predict the interpolation parameter. They saw that it massively improves efficiency without sacrificing too much in terms of performance.
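To make the question concrete, here is a minimal sketch of the two options being contrasted: a single fixed interpolation weight versus one predicted from the context. Everything named here is an assumption for illustration, not the method from the episode or from the CMU work: the toy distributions, the logistic `predict_lambda` gate, its features and weights, and the `skip_below` threshold. The retrieval-skipping step is one plausible way a predicted weight could save compute, and it is assumed rather than taken from the source.

```python
import math


def interpolate(p_lm, p_knn, lam):
    """Mix two distributions over the same vocabulary:
    lam * p_knn + (1 - lam) * p_lm."""
    return [lam * k + (1.0 - lam) * l for l, k in zip(p_lm, p_knn)]


def predict_lambda(features, weights, bias):
    """Hypothetical gate: logistic regression over context features.
    Returns an interpolation weight strictly between 0 and 1."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def gated_mix(p_lm, knn_lookup, lam, skip_below=0.05):
    """Assumed efficiency trick: when the predicted weight is tiny,
    skip the (expensive) nearest neighbor lookup and return the LM alone."""
    if lam < skip_below:
        return list(p_lm), False
    return interpolate(p_lm, knn_lookup(), lam), True


# Toy distributions over a three-word vocabulary (made-up numbers).
p_lm = [0.7, 0.2, 0.1]
p_knn = [0.1, 0.8, 0.1]

# Static setting: one weight chosen once and reused for every context.
static = interpolate(p_lm, p_knn, 0.25)

# Context-dependent setting: the weight comes from a small predictor.
lam = predict_lambda(features=[1.0, 0.5], weights=[2.0, -1.0], bias=0.0)
dynamic, used_knn = gated_mix(p_lm, lambda: p_knn, lam)

# A context where the gate predicts a near-zero weight skips retrieval.
low_lam = predict_lambda(features=[1.0, 0.5], weights=[-6.0, 0.0], bias=0.0)
lm_only, used_knn_low = gated_mix(p_lm, lambda: p_knn, low_lam)
```

In this sketch the mixture stays a valid distribution for any weight between 0 and 1, so the gate only changes how much the retrieved neighbors are trusted in a given context.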