128 - Dynamic Benchmarking, with Douwe Kiela

NLP Highlights

The Difficulty of Models in Different Domains

I think it's great that we have a platform for trying to answer these interesting research questions. When I read these papers, one question I keep thinking of is this: we know that we're building harder datasets, and that's obvious given that the models are not doing very well, but it's kind of hard to pinpoint why exactly these newer benchmarks are hard, right? For example, in your third round of the Adversarial NLI pipeline, you said you sampled data from diverse domains, which were different from the domains you sampled from in the first two rounds. Is it because the reasoning involved in these newer examples is hard, or is it just that the domain is different and so the models are just not trained enough in the new
