Latent Space: The AI Engineer Podcast

LLMs Everywhere: Running 70B models in browsers and iPhones using MLC — with Tianqi Chen of CMU / OctoML

Aug 10, 2023
Tianqi Chen, an Assistant Professor at CMU and the creator of XGBoost and Apache TVM, dives into the world of machine learning compilation. He discusses the ongoing GPU shortage and how MLC lets large language models run locally on consumer hardware rather than scarce data-center GPUs. Highlights include running a 70-billion-parameter model in a web browser, progress on AMD card support, and the roles of weight quantization and community collaboration in making powerful models accessible to developers.
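For readers unfamiliar with the term, here is a rough sketch of what group-wise 4-bit weight quantization looks like. This is illustrative only, not MLC's actual scheme; the function names and the group size of 128 are assumptions made for the example.

```python
# Illustrative sketch (not MLC's code path): symmetric group-wise int4 weight
# quantization, the kind of compression that shrinks weights roughly 4x versus
# fp16 so very large models can fit in browser/phone memory budgets.
import numpy as np

def quantize_4bit(w, group_size=128):
    """Quantize a flat fp32 weight vector to int4 codes with one scale per group."""
    w = w.reshape(-1, group_size)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0   # int4 range is [-8, 7]
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate fp32 weights from codes and per-group scales."""
    return (q.astype(np.float32) * scale).reshape(-1)

w = np.random.randn(1024).astype(np.float32)   # stand-in for one weight row
q, scale = quantize_4bit(w)
w_hat = dequantize(q, scale)
print("max abs reconstruction error:", np.abs(w - w_hat).max())
# Stored as 4-bit codes plus one scale per 128 weights, this costs about
# 4.1 bits per weight instead of 16, which is roughly why a 70B-parameter
# model becomes feasible on consumer hardware.
```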
AI Snips
ANECDOTE

Sketchbook Designs

  • Tianqi Chen uses sketchbooks to design and diagram his projects.
  • He has filled several books with design details, especially for TVM.
ANECDOTE

XGBoost's Unexpected Success

  • XGBoost was created as a byproduct of testing whether alternative models could outperform deep learning given enough data.
  • The initial hypothesis turned out to be wrong, but XGBoost itself became unexpectedly popular.
INSIGHT

Tree-Based Models Remain Relevant

  • Tree-based models remain relevant for tabular data because of their interpretability and ease of use.
  • They offer advantages such as handling features on very different scales and composing features automatically (see the minimal sketch after this list).
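As a rough illustration of the ease-of-use point (my sketch, not something shown in the episode), the snippet below fits a gradient-boosted tree model on a small tabular dataset with XGBoost's scikit-learn API; the dataset choice and hyperparameters are arbitrary.

```python
# Minimal XGBoost sketch on tabular data. Tree splits depend only on feature
# ordering, so no feature scaling or normalization is needed.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)  # tabular features on very different scales
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)

print("test accuracy:", model.score(X_test, y_test))
# Built-in per-feature importance scores give a rough form of interpretability.
print("top importances:", sorted(model.feature_importances_, reverse=True)[:5])
```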