LLMs Everywhere: Running 70B models in browsers and iPhones using MLC — with Tianqi Chen of CMU / OctoML

Latent Space: The AI Engineer Podcast

CHAPTER

Running Massive AI Models in Browsers

This chapter covers running a 70-billion-parameter model directly in a web browser using WebGPU. It discusses the hardware required, potential applications, and ongoing development work to improve interactive tools such as chatbots.
