The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Quantizing Transformers by Helping Attention Heads Do Nothing with Markus Nagel - #663

Dec 26, 2023
In this discussion, Markus Nagel, a research scientist at Qualcomm AI Research, shares insights from his recent papers at NeurIPS 2023, focusing on machine learning efficiency. He tackles the challenges of quantizing transformers, particularly in minimizing outlier issues in attention mechanisms. The conversation explores the pros and cons of pruning versus quantization for model weight compression and dives into innovative methods for multitask and multidomain learning. Additionally, the use of geometric algebra in enhancing algorithms for robotics is highlighted.
46:49

Podcast summary created with Snipd AI

Quick takeaways

  • Quantizable transformers address the activation quantization problems introduced by the attention mechanism.
  • Pruning and quantization are compared as methods for compressing model weights.

Deep dives

Stable Diffusion: World's Fastest Diffusion Model on Mobile Devices

Qualcomm showcased a demo of Stable Diffusion running in under one second, which it describes as the world's fastest diffusion model on mobile devices. This was achieved through full-stack AI optimizations, including model efficiency techniques such as quantization and knowledge distillation. Multi-stage knowledge distillation, efficient UNet pruning, and guidance distillation were introduced to speed up Stable Diffusion significantly. Together, these optimizations reduced compute and model size and cut the number of diffusion steps.
