
128 - Dynamic Benchmarking, with Douwe Kiela
NLP Highlights
Dynamic Benchmarks and Models in the Loop of Benchmark Creation
The idea is humans and models in the loop. If you're familiar with thinking about adversarial processes, people often have some sort of algorithm or model try to adversarially find mistakes that models make. What we're doing here is slightly different, because the adversary is not an algorithm or model; it's actually a human. So a human is talking to a model, and their job is essentially to find things that the model cannot yet do. If we keep doing this over time, the idea is that, again, we get good metrics, so we see how well these models do. We also collect a lot of data that can then be used for training, so that we get even better models, which we can then put back in the loop.
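The cycle described here, where humans probe a model, the examples that fool it become new training data, and the retrained model goes back in the loop, can be sketched in a few lines. This is a hypothetical toy illustration, not Dynabench's actual API; the function names, the memorizing "model", and the `train` step are all assumptions for the sake of the sketch.

```python
# Toy sketch of the human-and-model-in-the-loop benchmarking cycle.
# All names here are illustrative; a real system would use crowdworkers
# and an actual NLP model instead of a lookup table.

def dynamic_benchmark_round(model, human_examples, train):
    """One round: humans probe the model; fooling examples become training data."""
    fooling = [(x, y) for x, y in human_examples if model(x) != y]
    error_rate = len(fooling) / len(human_examples)  # the round's metric
    new_model = train(model, fooling)                # retrain on collected data
    return new_model, error_rate

# Toy "model": a lookup table that defaults to "pos" for unseen inputs.
known = {"good": "pos", "bad": "neg"}
model = lambda x: known.get(x, "pos")

def train(model, fooling):
    known.update(dict(fooling))  # memorize the examples that fooled the model
    return lambda x: known.get(x, "pos")

# Humans craft adversarial examples; "not good" and "terrible" fool round 1.
probes = [("good", "pos"), ("not good", "neg"), ("terrible", "neg")]
model, err1 = dynamic_benchmark_round(model, probes, train)
model, err2 = dynamic_benchmark_round(model, probes, train)
print(err1, err2)  # error drops after retraining on the fooling examples
```

In practice, humans would write fresh adversarial examples each round rather than reusing the same probes, which is what keeps the benchmark dynamic instead of saturating.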