Is There a Future for a Visual Question Answering System?
We've done a bunch of experimentation with, as you say, can we have some kind of detector or classifier to understand what's going to be interesting right now? And quickly we realized that when you have the human in the loop, it's really this idea of, can we know what the user's intent is? And that turns out to be a much harder problem. So we've opted for giving the user much more fine-grained control. But there's still an area I'm very interested in, and I guess the other end of that spectrum is to attempt to read out everything that's in the scene at a given time. We have really great image captioning technology now, but