
67 - GLUE: A Multi-Task Benchmark and Analysis Platform, with Sam Bowman
NLP Highlights
The Future of Language Understanding Tasks
We were seriously considering adding tasks like question answering or translation, where you have either large inputs or sort of a generation stage in the output. We want people to be able to use the models for Quora Question Pairs, for MNLI, for Stanford sentiment, that you would actually use if you wanted to solve these problems. What this meant, though, is that we're making evaluation pretty effort-intensive, pretty slow. And we're kind of hoping that with single-sentence and sentence-pair tasks, we're hitting a bit of a sweet spot as well.