
114 - Behavioral Testing of NLP Models, with Marco Tulio Ribeiro
NLP Highlights
The Most Surprising Failure of a Commercial System?
Y: I thought a lot of the stuff that you showed in the paper was really interesting. Can we dig into some details? Y: For this particular one, gugo had a failure rate of like 54%. We can talk about what that failure rate means in a second, but for a lot of the examples that we tried, this did not work. You would think that someone would have found that and fixed it, right? But apparently not.