How to train your own Large Multimodal Model — with Hugo Laurençon & Leo Tronchon of HuggingFace M4

Latent Space: The AI Engineer Podcast

Navigating the Challenges of Video Data in Model Training

This chapter covers the difficulties of training models on video data, such as handling many frames per sample and clips of variable length. It notes that video datasets are far more computationally demanding than text or image datasets, and discusses the current emphasis on image-text models over video models.
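
To make the frame-count and variable-length issues concrete, here is a minimal Python sketch. It is an illustration only, not a method described in the episode: the helper names `sample_frame_indices` and `visual_token_count`, the evenly spaced sampling strategy, and the idea of a fixed number of visual tokens per frame are all assumptions chosen for the example.

```python
def sample_frame_indices(total_frames: int, num_samples: int) -> list[int]:
    """Pick up to `num_samples` evenly spaced frame indices from a clip.

    Hypothetical helper: one common way to turn clips of different
    lengths into a fixed-size input is to subsample frames uniformly.
    """
    if total_frames <= 0 or num_samples <= 0:
        return []
    if total_frames <= num_samples:
        # Short clip: keep every frame rather than repeating any.
        return list(range(total_frames))
    # Split the clip into equal-width segments and take each midpoint.
    step = total_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]


def visual_token_count(num_frames: int, tokens_per_frame: int) -> int:
    """Visual tokens for one sample, assuming a fixed cost per frame."""
    return num_frames * tokens_per_frame


if __name__ == "__main__":
    # Assumed numbers for illustration: 64 visual tokens per frame.
    tokens_per_frame = 64
    indices = sample_frame_indices(total_frames=300, num_samples=8)
    print("sampled frames:", indices)
    print("image tokens:", visual_token_count(1, tokens_per_frame))
    print("video tokens:", visual_token_count(len(indices), tokens_per_frame))
```

Under these assumptions, even an aggressively subsampled clip costs several times the visual tokens of a single image, which is one way to see why video training is heavier than image-text training.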
