
75 - Reinforcement / Imitation Learning in NLP, with Hal Daumé III
NLP Highlights
00:00
The Cost of Initial Learning for NLP
The incumbent techniques these days are sequence-to-sequence models and their variants trained on maximum likelihood loss. And so now it's really a question of, is the test-time behavior of this model being substantially hurt by the fact that it gets into areas of the search space, or basically makes errors, that it doesn't know how to recover from? So if you have a part-of-speech tagger that's getting 95% accuracy, is this worth doing? Probably not. You should just stick with your maximum likelihood trained thing, because there's not much headroom. If your thing is 95% accurate, 95% of the time it's making the same predictions as the expert anyway.
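The error-compounding worry described above can be sketched as a toy simulation: suppose per-token accuracy is high when the previous prediction was correct, but drops once the model is conditioning on its own earlier mistake, a state it rarely saw during maximum likelihood training. The numbers here are illustrative assumptions, not from the episode.

```python
import random

def simulate(seq_len, acc_given_good=0.95, acc_given_bad=0.60, trials=10_000):
    """Toy model of exposure bias: accuracy on the next token depends on
    whether the previous prediction was right. Both accuracy values are
    made-up illustrative parameters."""
    correct = 0
    total = 0
    for _ in range(trials):
        prev_ok = True  # sequences start from a clean (gold-like) state
        for _ in range(seq_len):
            p = acc_given_good if prev_ok else acc_given_bad
            ok = random.random() < p
            correct += ok
            total += 1
            prev_ok = ok
    return correct / total

random.seed(0)
print(f"effective per-token accuracy: {simulate(seq_len=20):.3f}")
```

Because errors push the model into the low-accuracy regime, the effective per-token accuracy lands below the 95% it achieves from a correct prefix, which is the gap that reinforcement/imitation learning methods try to close.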