

Neural Network Quantization and Compression with Tijmen Blankevoort - TWIML Talk #292
Aug 19, 2019
In this discussion, Tijmen Blankevoort, a staff engineer at Qualcomm, delves into neural network compression and quantization. He explains how far ML models can be compressed without sacrificing accuracy and outlines the best strategies for doing so. The conversation also touches on the lottery ticket hypothesis and how over-parameterized networks give training more chances to find good features. Tijmen discusses challenges in automating compression, such as error propagation, and introduces data-free quantization techniques that improve performance across a range of models.
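
As a rough illustration of the kind of quantization discussed in the episode, the sketch below applies simple symmetric 8-bit post-training quantization to a weight tensor. The function names and the per-tensor scaling choice are assumptions for the example; this is not the data-free quantization method Tijmen describes.

```python
import numpy as np

# Hedged sketch: symmetric, per-tensor 8-bit quantization of a weight matrix.
# Generic illustration only; not the episode's data-free quantization method.
def quantize_int8(w, num_bits=8):
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    scale = np.abs(w).max() / qmax            # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

# Usage: round-trip a random weight matrix and inspect the quantization error.
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
print("max abs error:", np.abs(w - dequantize(q, scale)).max())
```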
Model Size
- Start with a smaller model architecture if possible.
- Train a larger model and compress it only when it is no more than roughly 2-3x too big for your target.
Lottery Ticket Hypothesis
- The lottery ticket hypothesis suggests that training a large network and then pruning it can work better than training a small network directly.
- More parameters mean more chances to find a well-performing subnetwork of good features (see the pruning sketch below).
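
A minimal sketch of that idea, assuming a PyTorch setup: train a large model, keep only the largest-magnitude weights, and rewind the survivors to their original initialization before retraining, in the spirit of the lottery ticket hypothesis. The model, sparsity level, and helper names are illustrative placeholders.

```python
import torch
import torch.nn as nn

# Minimal lottery-ticket-style magnitude pruning sketch (illustrative only).
def magnitude_masks(model: nn.Module, sparsity: float = 0.8):
    """Keep the largest-magnitude weights in each weight matrix."""
    masks = {}
    for name, param in model.named_parameters():
        if param.dim() < 2:                    # skip biases and norm parameters
            continue
        k = max(1, int(param.numel() * sparsity))
        threshold = param.abs().flatten().kthvalue(k).values
        masks[name] = (param.abs() > threshold).float()
    return masks

def rewind_to_ticket(model: nn.Module, masks, init_state):
    """Reset surviving weights to their original initialization, zero the rest."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                param.copy_(init_state[name] * masks[name])

model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
init_state = {k: v.clone() for k, v in model.state_dict().items()}
# ... train `model` on your task here ...
masks = magnitude_masks(model, sparsity=0.8)
rewind_to_ticket(model, masks, init_state)     # the "winning ticket" to retrain
```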
Compressing ResNets
- For ResNet architectures, start with compression.
- Use tensor factorization or channel pruning algorithms (a factorization sketch follows below).
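
As one concrete example of tensor factorization (a hedged sketch, not the specific algorithms from the episode): approximate a layer's weight matrix with a truncated SVD and replace it with two smaller layers. It is shown on `nn.Linear` for simplicity; ResNet convolutions would need a conv-specific decomposition, and the rank and layer sizes below are arbitrary.

```python
import torch
import torch.nn as nn

# Hedged sketch: low-rank factorization of a linear layer via truncated SVD.
def factorize_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    W = layer.weight.data                             # (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]                      # fold singular values into U
    V_r = Vh[:rank, :]
    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    first.weight.data.copy_(V_r)
    second.weight.data.copy_(U_r)
    if layer.bias is not None:
        second.bias.data.copy_(layer.bias.data)
    return nn.Sequential(first, second)

# Usage: a rank-64 factorization of a 512x512 layer cuts its weight count by ~4x.
layer = nn.Linear(512, 512)
compressed = factorize_linear(layer, rank=64)
x = torch.randn(8, 512)
print("max output difference:", (layer(x) - compressed(x)).abs().max().item())
```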