
128 - Dynamic Benchmarking, with Douwe Kiela
NLP Highlights
The High-Level Trends in the Results of the NLI Paper
We're getting better at the average case, right, but as a computer scientist you also care about the worst case, especially if you want to deploy these models in production or in important situations where they have a real effect on people's lives. The worst case is ultimately what matters for deployment, so that is one of the things that we're measuring with this platform in the adversarial case.

Yeah, I have a couple more questions related to this, but I guess they're mostly related to distributional shift, which we'll talk about in a few minutes. Can you give us an overview of the results, the high-level trends of what you saw in the experiments?