Harnessing Transformers for Video Understanding
OpenAI is applying transformers, an architecture traditionally used for text generation, to video processing. An encoder embeds each image into a latent space, representing the image's meaning as a list of numbers. The approach extends this by breaking images into patches and tracking those patches over time across video frames, which enables a deeper understanding of video content.
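The patch idea can be made concrete with a small sketch. This is not OpenAI's implementation: the function names `video_to_patches` and `embed_patches`, the patch sizes, and the embedding width are all hypothetical, and a random linear projection stands in for a trained encoder. It only illustrates how a video can be cut into patches that span space and time, with each patch turned into a list of numbers.

```python
import numpy as np

def video_to_patches(video, patch_size=4, frames_per_patch=2):
    """Split a video array of shape (T, H, W, C) into flattened spacetime patches."""
    t, h, w, c = video.shape
    if t % frames_per_patch or h % patch_size or w % patch_size:
        raise ValueError("video dimensions must be divisible by the patch dimensions")
    nt, nh, nw = t // frames_per_patch, h // patch_size, w // patch_size
    # Give each patch's frames, rows, and columns their own axes.
    x = video.reshape(nt, frames_per_patch, nh, patch_size, nw, patch_size, c)
    # Put the patch-index axes first, then the within-patch axes.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    # One row per patch, holding that patch's raw pixel values.
    return x.reshape(nt * nh * nw, frames_per_patch * patch_size * patch_size * c)

def embed_patches(patches, embed_dim=8, seed=0):
    """Project patches into a latent space (assumption: untrained random weights)."""
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal((patches.shape[1], embed_dim))
    return patches @ weights

# Toy video: 4 frames of 8x8 RGB pixels.
video = np.arange(4 * 8 * 8 * 3, dtype=np.float64).reshape(4, 8, 8, 3)
patches = video_to_patches(video)
embeddings = embed_patches(patches)
```

Here the toy video yields 8 patches, each covering 2 frames of a 4x4 pixel region, and each patch becomes one vector in the latent space. In a real model the projection would be learned, and the resulting sequence of patch vectors would be fed to the transformer much as word tokens are for text.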