Latent Space: The AI Engineer Podcast

Everything you need to run Mission Critical Inference (ft. DeepSeek v3 + SGLang)

Jan 19, 2025
Join Amir Haghighat, co-founder of Baseten, and Yineng Zhang, lead software engineer at Baseten, as they dive into the groundbreaking DeepSeek v3 model. This model boasts 671 billion parameters and has shaken up LLM inference platforms. They unravel the complexities of deploying massive models, discuss the innovations of SGLang, and delve into the challenges of caching technologies. With insights on optimizing AI workflows and a clear manifesto for crucial applications, this conversation is a must-listen for AI enthusiasts!
01:00:04

Podcast summary created with Snipd AI

Quick takeaways

  • DeepSeek v3's launch showcases a leap in AI capabilities with its 671 billion parameters and advanced Mixture of Experts structure.
  • Deploying a model as large as DeepSeek v3 requires substantial infrastructure; Baseten, for example, serves it on H200 GPU clusters.

Deep dives

DeepSeek v3 Launch and Specifications

The release of DeepSeek v3 marks a significant advance in language models. It has 671 billion parameters arranged in a fine-grained Mixture of Experts (MoE) structure. The model was trained natively in FP8 mixed precision on about 15 trillion tokens, and it uses multi-head latent attention and a new multi-token prediction objective. As of January 2025, DeepSeek v3 is ranked seventh overall on the LM Arena leaderboard and is the highest-ranked open-weights model there. The launch reflects a growing trend of large open-weights models coming out of Chinese labs.
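
To make the scale concrete, here is a rough back-of-the-envelope sketch of how much GPU memory the weights alone would occupy. The 671 billion parameter count and FP8 precision come from the summary above; the 141 GB per-H200 memory figure and the helper names are assumptions added for illustration, and the estimate ignores KV cache, activations, and runtime overhead, so a real deployment would need noticeably more headroom.

```python
import math

PARAMS = 671e9             # total parameters, from the episode summary
BYTES_PER_PARAM_FP8 = 1    # FP8 stores one byte per weight
BYTES_PER_PARAM_BF16 = 2   # shown for comparison only
H200_MEMORY_GB = 141       # assumption: memory capacity of one H200, not stated in the summary


def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def min_gpus_for_weights(params: float, bytes_per_param: int, gpu_memory_gb: float) -> int:
    """Lower bound on GPU count: weights only, no KV cache or activations."""
    return math.ceil(weight_memory_gb(params, bytes_per_param) / gpu_memory_gb)


if __name__ == "__main__":
    for label, width in (("FP8", BYTES_PER_PARAM_FP8), ("BF16", BYTES_PER_PARAM_BF16)):
        gb = weight_memory_gb(PARAMS, width)
        gpus = min_gpus_for_weights(PARAMS, width, H200_MEMORY_GB)
        print(f"{label}: ~{gb:.0f} GB of weights, at least {gpus} GPUs")
```

Under these assumptions the FP8 weights do not fit on a single GPU, which is consistent with the takeaway that serving this model calls for a multi-GPU cluster.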
