
120 - Evaluation of Text Generation, with Asli Celikyilmaz
NLP Highlights
The Cost of Human Evaluation in Text Generation
Most research that's done in text generation usually also has human evaluation. It almost seems like that's the gold standard for evaluation. Is it because the idea is that the end consumers of text generation systems are usually humans? I think so. In intrinsic evaluation, we ask people to evaluate the quality of the generated text. Machine translation might be one example. This way, if there are n different types of criteria, we'll probably have n different types of intrinsic evaluations, which is very rich. And then you can go back and evaluate how your models are doing with these experiments.
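The idea of n criteria yielding n separate intrinsic evaluations can be sketched as follows. This is a minimal illustration, not anything described in the episode: the criterion names, rating scale, and scores are all hypothetical.

```python
# Hypothetical intrinsic human evaluation: annotators score each
# generated text on several criteria; each criterion is aggregated
# separately, giving one intrinsic evaluation per criterion.
from statistics import mean

# Illustrative ratings from three annotators on a 1-5 scale.
ratings = {
    "fluency":   [4, 5, 4],
    "coherence": [3, 4, 4],
    "adequacy":  [5, 4, 5],
}

# With n criteria, we get n separate per-criterion averages.
per_criterion = {crit: mean(scores) for crit, scores in ratings.items()}

for crit, avg in per_criterion.items():
    print(f"{crit}: {avg:.2f}")
```

Each criterion's average is reported on its own, so a model can be strong on, say, fluency while weak on coherence, which is what makes this style of evaluation rich.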