Latent Space: The AI Engineer Podcast

State of the Art: Training >70B LLMs on 10,000 H100 clusters

Jun 25, 2024
Jonathan Frankle, Chief AI Scientist at Databricks, and Josh Albrecht, CTO of Imbue, discuss recent advances in training large language models. They introduce Imbue 70B, a model reported to outperform GPT-4o on reasoning benchmarks while using significantly less data. They share what it takes to scale GPU clusters and why high-performance infrastructure matters, discuss how to evaluate language models, and introduce tools for hyperparameter tuning, before turning to the future of AI in coding and reasoning tasks.
01:21:49

Podcast summary created with Snipd AI

Quick takeaways

  • Imbue 70B is reported to surpass GPT-4o on reasoning benchmarks while using less data.
  • The episode covers infrastructure scripts for high-performance training and CARBS, a cost-aware hyperparameter optimizer.

Deep dives

Introduction of Jonathan Frankle and Josh Albrecht

Jonathan Frankle and Josh Albrecht, both previous guests on the podcast, discuss recent developments at Databricks and Mosaic. Jonathan is now Chief AI Scientist at Databricks, one of several changes in their roles since their earlier appearances.
