Unification of Video and Language Models
The way we represent videos right now is just more cumbersome. So that's been a big focus of our work: how to make video and language models much, much more efficient, for example by using very specific keyframes rather than all the frames. One of the things enabling this next level of unification for multimodality is the emergence of transformers as a standard substrate for machine learning applications.
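To make the keyframe idea concrete, here is a minimal Python sketch. The speaker does not say how keyframes are chosen, so the rule used here (keep a frame only when it differs enough from the last kept frame) is an assumption for illustration, as are the names `select_keyframes`, `mean_abs_diff`, and `threshold`, and the toy four-pixel frames.

```python
from typing import List, Sequence

# Hypothetical representation: a frame flattened to a list of pixel intensities.
Frame = Sequence[float]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute per-pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_keyframes(frames: List[Frame], threshold: float) -> List[int]:
    """Return indices of frames that differ enough from the last kept frame.

    Assumed heuristic, not the method described in the talk.
    """
    if not frames:
        return []
    kept = [0]  # always keep the first frame
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[kept[-1]]) > threshold:
            kept.append(i)
    return kept


# Toy "video": five frames of four pixels each.
frames = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.0],  # nearly identical to frame 0
    [1.0, 1.0, 1.0, 1.0],  # large change
    [1.0, 0.9, 1.0, 1.0],  # nearly identical to frame 2
    [0.0, 0.0, 1.0, 1.0],  # another large change
]
keyframe_indices = select_keyframes(frames, threshold=0.25)
print(keyframe_indices)  # [0, 2, 4]
```

Only three of the five frames would be passed on to a downstream model here, which is the kind of saving the speaker is pointing at. A real system would likely use a learned or task-aware selection rule rather than raw pixel differences.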