LLMs Everywhere: Running 70B models in browsers and iPhones using MLC — with Tianqi Chen of CMU / OctoML

Latent Space: The AI Engineer Podcast

Running Massive AI Models in Browsers

This chapter covers running a 70-billion-parameter model directly in a web browser using WebGPU. It discusses the hardware required, potential applications, and ongoing development work to improve interactive tools such as chatbots.
