The Negative Effects of Benchmarks on Performance
In a survey, 88% of people in NLP said they think we're overemphasizing benchmarks. And I think this partly comes from disillusionment with the benchmarks we build. Yet there are not many alternatives for assessing and measuring progress. So we continue to have these large-scale benchmarks, especially multitask ones, because you can maybe extract some kind of signal from those. There's still more juice that we can get out of benchmarks.