NLP Highlights cover image

67 - GLUE: A Multi-Task Benchmark and Analysis Platform, with Sam Bowman

NLP Highlights

00:00

The Diagnostic Data Set for GLUE: A Set of 1,000 Textual Entailment Examples

The diagnostic data set we have now is a set of about 1,000 textual entailment examples. We're looking at predicate argument structure and how well the models capture that. The performance on that data doesn't count toward your primary score. There's just something you can click on a system on the leaderboard,. You can see how it does on these categories likepredicate argument structure. And here you just evaluate your M&LI classifier, your multi-analyze classifier, on this data.

Transcript
Play full episode

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app