How to Train Computer Vision Models to Match Correct Captions
We collect and release a large annotated corpus of New Yorker cartoons. In the pixel setting, we use annotations like the ones you're describing to train computer vision models. But in the description setting, we sort of circumvent that process by just handing the models the human-authored descriptions. GPT-4 gets around 65% accuracy on this 50/50 task. We might expect human performance to be lower too, exactly because humor is more subjective.
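
To make the "50/50 task" concrete, here is a minimal sketch of how a two-way caption-matching evaluation could be scored. It assumes each example pairs one cartoon description with two candidate captions, exactly one of which is correct; the names `two_way_accuracy`, `always_first`, and `TOY_EXAMPLES` are hypothetical, and the toy rows are invented placeholders, not data from the corpus discussed here.

```python
from typing import Callable, List, Tuple

# (description, caption_a, caption_b, index of the correct caption: 0 or 1)
Example = Tuple[str, str, str, int]


def two_way_accuracy(
    examples: List[Example],
    choose: Callable[[str, str, str], int],
) -> float:
    """Fraction of examples where `choose` picks the correct caption."""
    if not examples:
        raise ValueError("need at least one example")
    correct = 0
    for description, caption_a, caption_b, answer in examples:
        if choose(description, caption_a, caption_b) == answer:
            correct += 1
    return correct / len(examples)


def always_first(description: str, caption_a: str, caption_b: str) -> int:
    """Trivial baseline that ignores its input and always picks caption A."""
    return 0


# Invented placeholder rows, used only to exercise the scoring function.
TOY_EXAMPLES: List[Example] = [
    ("description 1", "caption A1", "caption B1", 0),
    ("description 2", "caption A2", "caption B2", 1),
    ("description 3", "caption A3", "caption B3", 0),
    ("description 4", "caption A4", "caption B4", 0),
]

if __name__ == "__main__":
    print(two_way_accuracy(TOY_EXAMPLES, always_first))
```

When the correct caption is equally likely to appear in either position, a chooser that guesses blindly scores about 50% in expectation, which is the baseline the roughly 65% figure should be read against. In the description setting, `choose` would wrap a language model that sees the human-authored description in place of the image.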