Latent Space: The AI Engineer Podcast — Practitioners talking LLMs, CodeGen, Agents, Multimodality, AI UX, GPU Infra and all things Software 3.0

How to train your own Large Multimodal Model — with Hugo Laurençon & Leo Tronchon of HuggingFace M4

Jan 19, 2024
01:11:50
Hugo Laurençon and Leo Tronchon of HuggingFace M4 discuss training large multimodal models, the challenges of working with video data, image resolution in OCR tasks, and the importance of creating deduplication rules for the industry.

Podcast summary created with Snipd AI

Quick takeaways

  • Improving image resolution, using open-source models, and enhancing data quality are crucial in the development of multi-modal models.
  • The evaluation of multi-modal models faces challenges such as limited benchmarks and the need for better metrics and data quality.

Deep dives

Improving Multi-Modality with Smaller Models and Better Data

On the Latent Space podcast, Hugo Laurençon and Leo Tronchon discuss the progress made in multi-modality. They highlight three areas of focus: improving image resolution, building on better open-source models such as CLIP and Mistral, and raising the quality of the training data. By sorting hallucinations into specific categories, such as object attributes, absent objects, and environmental inaccuracies, they aim to identify the areas that need further fine-tuning. Their goal is to match or surpass the performance of closed-source models like GPT-4V, and to develop dedicated datasets for tasks such as OCR and counting, which remain challenging for multi-modal models.
