Latent Space: The AI Engineer Podcast

LLMs Everywhere: Running 70B models in browsers and iPhones using MLC — with Tianqi Chen of CMU / OctoML

Aug 10, 2023
Tianqi Chen, an Assistant Professor at CMU and the creator of XGBoost and Apache TVM, discusses machine learning compilation. He addresses the ongoing GPU shortage and explores how to run large language models on devices without needing GPUs at all. Highlights include running a 70-billion-parameter model in a web browser and improved support for AMD cards, both of which make powerful AI more accessible to developers. The conversation also covers the importance of weight quantization and community collaboration in optimizing machine learning tools.
52:10
