Efficiency of Approximate Matrix Multiplication Using Vector Quantization

This chapter explores a novel method of approximate matrix multiplication through vector quantization, reducing the number of operations needed for efficient computation. It compares this technique with binary neural networks, highlighting the benefits of vector quantization's expressivity and flexibility in leveraging mutual information across parameters.
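To make the idea concrete, here is a minimal sketch of approximate matrix multiplication via vector quantization. It uses plain product quantization with a small k-means loop, which is a simpler stand-in and not the hash-based encoder from the paper linked in the episode notes. All function names (`fit_codebooks`, `encode`, `build_tables`, `approx_matmul`) are hypothetical and chosen only for illustration. Each row of A is split into subvectors, each subvector is replaced by the index of its nearest centroid, and the product with B becomes a sum of precomputed table lookups.

```python
import numpy as np

def fit_codebooks(A_train, n_subspaces, n_centroids, n_iters=10, seed=0):
    """Learn one small k-means codebook per column block (subspace) of A_train."""
    rng = np.random.default_rng(seed)
    n, d = A_train.shape
    assert d % n_subspaces == 0, "columns must split evenly into subspaces"
    sub = d // n_subspaces
    codebooks = np.empty((n_subspaces, n_centroids, sub))
    for s in range(n_subspaces):
        X = A_train[:, s * sub:(s + 1) * sub]
        # Initialize centroids from distinct training rows.
        C = X[rng.choice(n, n_centroids, replace=False)].copy()
        for _ in range(n_iters):
            d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
            assign = d2.argmin(1)
            for k in range(n_centroids):
                members = X[assign == k]
                if len(members):
                    C[k] = members.mean(0)
        codebooks[s] = C
    return codebooks

def encode(A, codebooks):
    """Replace each subvector of each row of A with its nearest centroid index."""
    n_subspaces, _, sub = codebooks.shape
    codes = np.empty((A.shape[0], n_subspaces), dtype=np.int64)
    for s in range(n_subspaces):
        X = A[:, s * sub:(s + 1) * sub]
        d2 = ((X[:, None, :] - codebooks[s][None, :, :]) ** 2).sum(-1)
        codes[:, s] = d2.argmin(1)
    return codes

def build_tables(B, codebooks):
    """Precompute tables[s, k, j] = dot(codebooks[s, k], B[rows of subspace s, j])."""
    n_subspaces, _, sub = codebooks.shape
    B_split = B.reshape(n_subspaces, sub, B.shape[1])
    return np.einsum("skd,sdj->skj", codebooks, B_split)

def approx_matmul(codes, tables):
    """Approximate A @ B by summing looked-up table rows, one per subspace."""
    out = np.zeros((codes.shape[0], tables.shape[2]))
    for s in range(tables.shape[0]):
        out += tables[s, codes[:, s]]
    return out

# Usage sketch on random data (sizes are arbitrary assumptions).
rng = np.random.default_rng(0)
A = rng.normal(size=(256, 32))
B = rng.normal(size=(32, 8))
codebooks = fit_codebooks(A, n_subspaces=4, n_centroids=16)
tables = build_tables(B, codebooks)        # done once per B
approx = approx_matmul(encode(A, codebooks), tables)
rel_err = np.linalg.norm(approx - A @ B) / np.linalg.norm(A @ B)
print(f"relative error: {rel_err:.3f}")
```

In this sketch the tables are built once per B, so the per-row work at query time is the encoding step plus one lookup and add per subspace. The encoder above still computes distances to every centroid, so it only illustrates the lookup structure, not the speedups discussed in the episode.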

Episode notes

Join us at our first in-person conference on June 25, all about AI Quality: https://www.aiqualityconference.com/

Huge thank you to Databricks AI for sponsoring this episode.

Bandish Shah is an Engineering Manager at MosaicML/Databricks, where he focuses on making generative AI training and inference efficient, fast, and accessible by bridging the gap between deep learning, large-scale distributed systems, and performance computing.

Davis Blalock is a Research Scientist and the first employee of MosaicML, a GenAI startup acquired by Databricks for $1.3 billion.

MLOps podcast #219 with Databricks' Engineering Manager Bandish Shah and Research Scientist Davis Blalock: The Art and Science of Training Large Language Models.

// Abstract

What's hard about language models at scale? Turns out...everything. MosaicML's Davis and Bandish share war stories and lessons learned from pushing the limits of LLM training and helping dozens of customers get LLMs into production. They cover what can go wrong at every level of the stack, how to make sure you're building the right solution, and some contrarian takes on the future of efficient models.

// Bio

Bandish Shah

Bandish Shah is an Engineering Manager at MosaicML/Databricks, where he focuses on making generative AI training and inference efficient, fast, and accessible by bridging the gap between deep learning, large-scale distributed systems, and performance computing. Bandish has over a decade of experience building systems for machine learning and enterprise applications. Prior to MosaicML, Bandish held engineering and development roles at SambaNova Systems where he helped develop and ship the first RDU systems from the ground up, and Oracle where he worked as an ASIC engineer for SPARC-based enterprise servers.

Davis Blalock

Davis Blalock is a research scientist at MosaicML. He completed his PhD at MIT, advised by Professor John Guttag. His primary work is designing high-performance machine learning algorithms. He received his M.S. from MIT and his B.S. from the University of Virginia. He is a Qualcomm Innovation Fellow, NSF Graduate Research Fellow, and Barry M. Goldwater Scholar.

// MLOps Jobs board

jobs.mlops.community

// MLOps Swag/Merch

https://mlops-community.myshopify.com/

// Related Links

AI Quality In-person Conference: https://www.aiqualityconference.com/

Website: http://databricks.com/

Davis Summarizes Papers newsletter signup link

Davis' Newsletters:

Learning to recognize spoken words from five unlabeled examples in under two seconds: https://arxiv.org/abs/1609.09196

Training on data at 5GB/s in a single thread: https://arxiv.org/abs/1808.02515

Nearest-neighbor searching through billions of images per second in one thread with no indexing: https://arxiv.org/abs/1706.10283

Multiplying matrices 10-100x faster than a matrix multiply (with some approximation error): https://arxiv.org/abs/2106.10860

Hidden Technical Debt in Machine Learning Systems: https://proceedings.neurips.cc/paper_files/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf

--------------- ✌️Connect With Us ✌️ -------------

Join our Slack community: https://go.mlops.community/slack

Catch all episodes, blogs, newsletters, and more: https://mlops.community/

Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/

Connect with Davis on LinkedIn: https://www.linkedin.com/in/dblalock/

Connect with Bandish on LinkedIn: https://www.linkedin.com/in/bandish-shah/
