Markus Nagel, research scientist at Qualcomm AI Research, discusses his accepted papers at NeurIPS 2023. Topics include tackling activation quantization issues, comparing pruning and quantization, using scalarization in multi-domain learning, applying geometric algebra with equivariance to transformers, and deductive verification of chain of thought reasoning.
Podcast summary created with Snipd AI
Quick takeaways
Quantizable Transformers addresses the activation quantization issues introduced by the attention mechanism.
Pruning vs Quantization compares how effective the two methods are at compressing model weights.
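To make the first takeaway concrete, here is a minimal sketch of why activation outliers are a problem for quantization. It is not the paper's method; it only illustrates the issue the paper targets. The function names, the Gaussian activations, and the outlier value of 60.0 are all assumptions chosen for illustration.

```python
import random


def quantize(values, bits=8):
    """Symmetric uniform quantization: snap each value to a signed integer
    grid whose step (scale) is set by the largest magnitude, then map back."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
# Hypothetical activations: 1000 well-behaved values around zero.
activations = [random.gauss(0.0, 1.0) for _ in range(1000)]
# The same tensor with one large outlier appended (illustrative value).
with_outlier = activations + [60.0]

# Error on the 1000 ordinary values when they set the scale themselves.
err_clean = mean_squared_error(activations, quantize(activations))
# Error on the same 1000 values when the outlier sets the scale.
err_outlier = mean_squared_error(activations, quantize(with_outlier)[:-1])

print(f"MSE without outlier: {err_clean:.6f}")
print(f"MSE with outlier:    {err_outlier:.6f}")
```

A single outlier stretches the quantization grid, so every ordinary value is rounded more coarsely. That is the kind of degradation that removing outliers is meant to avoid.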
Deep dives
Stable Diffusion: World's Fastest Diffusion Model on Mobile Devices
Qualcomm showcased a demo of Stable Diffusion running in under one second, which it describes as the world's fastest diffusion model on mobile devices. This was achieved through full-stack AI optimizations, including model efficiency techniques such as quantization and knowledge distillation. Multi-stage knowledge distillation, efficient UNet pruning, and guidance distillation were introduced to speed up Stable Diffusion significantly. Together, these optimizations reduced compute and model size and cut the number of diffusion steps.
Fast AI Assistant: Running LLM on Mobile Devices Offline
Qualcomm demonstrated a recent LLM with 7 billion parameters running entirely on a smartphone without any internet connection. This showcased the capabilities of on-device AI and the full-stack AI optimizations employed by Qualcomm. These included system optimizations, model efficiency techniques such as quantization and knowledge distillation, compilation, and hardware acceleration on Qualcomm's AI Engine and Hexagon NPU. The demo highlighted the potential of on-device AI for offline assistance.
On-Device Learning for Video Segmentation
Qualcomm presented a demo of on-device learning for video segmentation, performing real-time video segmentation on a mobile device. The demo showed what on-device AI can do for real-time visual analysis and how effective full-stack AI optimizations are, including model efficiency techniques and hardware acceleration. Because segmentation runs on the device itself, users get real-time, personalized, and privacy-preserving video editing and effects.
Generative Relighting: Enhancing Visual Effects
Qualcomm exhibited a demo on generative relighting, leveraging AI to enhance visual effects in real time on a mobile device. This demo showcased the combination of generative AI technologies with full-stack AI optimizations. By running complex generative models efficiently on the device, users can experience interactive and immersive visual effects without relying on cloud computing. The demo highlighted Qualcomm's commitment to on-device AI and the potential for advanced visual effects in various applications.
Today we’re joined by Markus Nagel, research scientist at Qualcomm AI Research, who helps us kick off our coverage of NeurIPS 2023. In our conversation with Markus, we cover his accepted papers at the conference, along with other work presented by Qualcomm AI Research scientists. Markus’ first paper, Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing, focuses on tackling activation quantization issues introduced by the attention mechanism and how to solve them. We also discuss Pruning vs Quantization: Which is Better?, which focuses on comparing the effectiveness of these two methods in achieving model weight compression. Additional papers discussed focus on topics like using scalarization in multitask and multidomain learning to improve training and inference, using diffusion models for a sequence of state models and actions, applying geometric algebra with equivariance to transformers, and applying a deductive verification of chain of thought reasoning performed by LLMs.
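As a rough illustration of the pruning-versus-quantization comparison mentioned above, the sketch below applies both to the same toy weight tensor and measures the reconstruction error of each. This is not the paper's experimental setup. It assumes a 16-bit baseline, so that 50% magnitude pruning and 8-bit quantization each roughly halve storage (ignoring the cost of storing the sparsity mask), and it uses synthetic Gaussian weights. All function names are hypothetical.

```python
import random


def quantize(weights, bits):
    """Symmetric uniform quantization to a signed integer grid."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    n_pruned = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_pruned]:
        pruned[i] = 0.0
    return pruned


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
# Synthetic weights (assumption): real layers need not be Gaussian.
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]

pruned = magnitude_prune(weights, sparsity=0.5)
quantized = quantize(weights, bits=8)

prune_error = mean_squared_error(weights, pruned)
quant_error = mean_squared_error(weights, quantized)

print(f"50% magnitude pruning MSE: {prune_error:.6f}")
print(f"8-bit quantization MSE:    {quant_error:.6f}")
```

Weight-level error on a toy tensor is only a proxy. Which method is preferable for a trained network also depends on the weight distribution, the compression ratio, and whether the model is fine-tuned afterwards.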
The complete show notes for this episode can be found at twimlai.com/go/663.