Latent Space: The AI Engineer Podcast — Practitioners talking LLMs, CodeGen, Agents, Multimodality, AI UX, GPU Infra and all things Software 3.0

2024 in Vision [LS Live @ NeurIPS]

Dec 22, 2024
In this engaging discussion, Isaac Robinson and Peter Robicheaux from Roboflow share insights on the latest trends and groundbreaking papers in computer vision for 2024. They highlight the shift towards video-based models like 'Sora' and advancements in real-time object detection. Vik Korrapati, founder of Moondream, presents challenges in developing vision language models and introduces a compact, pruned model. Together, they explore how these innovations can reshape the landscape of computer vision and enhance pre-trained model efficiencies.
57:25

Podcast summary created with Snipd AI

Quick takeaways

  • 2024 sees the rise of vision language models like GPT-4o and Claude 3, significantly enhancing AI's ability to process visual and textual data.
  • Innovations in video generation, particularly through tools like MAGVIT and Sora, demonstrate major advancements in coherent video sequence creation and tokenization techniques.

Deep dives

Vision Language Models Become Mainstream

2024 marks a significant shift as vision language models gain mainstream acceptance across various AI applications. This transition is highlighted by the emergence of numerous models, such as GPT-4o, Claude 3, Gemini 1 and 2, Llama 3.2, and Mistral's Pixtral, that now incorporate multimodal capabilities. This evolution signals a broader industry trend towards combining visual and textual data processing, making AI models more versatile in handling complex tasks.
