
114 - Behavioral Testing of NLP Models, with Marco Tulio Ribeiro
NLP Highlights
How Do We Decentralize Evaluation?
How do we deal with the potential disconnect between what the user of a system wants and what the person building the system thinks is important? That's a great question. I'm glad you brought it up. My goal with this paper was not to come up with something that only researchers could do. So I think unless we decentralize evaluation, this problem is never going to work. If you want stuff to be reusable, people have to be able to evaluate on their own.