
60 - FEVER: a large-scale dataset for Fact Extraction and VERification, with James Thorne
NLP Highlights
The Difference Between a FEVER Task and a Lifestyle Hypothesis
We find that if we ignore the requirement for correct evidence, our classifier applies the right label about 52% of the time. We find that the major bottleneck there is the evidence retrieval part of our system, rather than the classifier. But the way we go about training our classifier affects how it works in this noisy environment, where we take evidence from an IR system as well.
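To make the two ways of scoring concrete, here is a minimal Python sketch, not the official FEVER scorer. It contrasts label accuracy that ignores evidence with a stricter score that also requires the retrieved evidence to cover at least one gold evidence set. The `Example` class, its field names, and the toy data are assumptions made for illustration; only the three label names come from the dataset.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

# A sentence is identified by (page title, sentence index). Hypothetical layout.
Sentence = Tuple[str, int]

@dataclass
class Example:
    gold_label: str                           # SUPPORTS / REFUTES / NOT ENOUGH INFO
    gold_evidence: List[FrozenSet[Sentence]]  # alternative complete evidence sets
    pred_label: str
    pred_evidence: FrozenSet[Sentence]        # sentences returned by retrieval

def label_accuracy(examples: List[Example]) -> float:
    """Fraction of claims with the right label, ignoring evidence."""
    correct = sum(ex.pred_label == ex.gold_label for ex in examples)
    return correct / len(examples)

def strict_score(examples: List[Example]) -> float:
    """Fraction of claims with the right label AND sufficient evidence.

    NOT ENOUGH INFO claims have no gold evidence, so only the label counts.
    """
    correct = 0
    for ex in examples:
        if ex.pred_label != ex.gold_label:
            continue
        if ex.gold_label == "NOT ENOUGH INFO":
            correct += 1
        elif any(gold <= ex.pred_evidence for gold in ex.gold_evidence):
            # At least one gold evidence set is fully contained in the prediction.
            correct += 1
    return correct / len(examples)

# Toy data, invented for illustration only.
toy = [
    # Right label, evidence found (extra retrieved sentences are allowed).
    Example("SUPPORTS", [frozenset({("Page_A", 0)})],
            "SUPPORTS", frozenset({("Page_A", 0), ("Page_B", 3)})),
    # Right label, but retrieval missed the gold sentence.
    Example("REFUTES", [frozenset({("Page_C", 2)})],
            "REFUTES", frozenset({("Page_D", 1)})),
    # NOT ENOUGH INFO: only the label is checked.
    Example("NOT ENOUGH INFO", [], "NOT ENOUGH INFO", frozenset()),
    # Wrong label, even though the evidence is right.
    Example("SUPPORTS", [frozenset({("Page_E", 5)})],
            "REFUTES", frozenset({("Page_E", 5)})),
]

print(label_accuracy(toy), strict_score(toy))
```

On the toy data the second claim is what separates the two scores: the label is right but retrieval missed the gold sentence, which is the kind of case where the evidence retrieval step, rather than the classifier, is the limiting factor.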