
123 - Robust NLP, with Robin Jia
NLP Highlights
How to Train a Model on a Data Set
In general, I think SQuAD is a good data set. It doesn't have much noise, but it's just kind of easy to learn. And so, you know, when you have a data set that looks like that, it's only natural that the models you train on it will rely heavily on these sorts of easy-to-learn heuristics rather than actually doing real reading comprehension. The word-salad approach led to lower accuracies. But for the natural distracting sentences, we got crowd workers to basically write a few possibilities for us. There are far fewer candidates to try there.