123 - Robust NLP, with Robin Jia

NLP Highlights

The Importance of Label Imbalance in Active Learning

Yes, so in general, when we talk about label imbalance, I think the simplest thing to think about is just binary classification. We at least want to train on somewhat balanced datasets; that just kind of makes model training work better. But the actual distribution you care about might be extremely imbalanced, in that one label might be much more common than the other. In QA, this also arises whenever you think about things related to open-domain question answering, where there are tons of documents on the web and very few of them, perhaps none of them, actually answer the question that the user has asked. So if you're trying to use a system to detect whether some new piece of writing is a duplicate
