

Neural Network Quantization and Compression with Tijmen Blankevoort - TWIML Talk #292
Aug 19, 2019
In this discussion, Tijmen Blankevoort, a staff engineer at Qualcomm, delves into neural network compression and quantization. He explains how far ML models can be compressed without sacrificing accuracy and outlines the best strategies for doing so. The conversation also touches on the lottery ticket hypothesis and how over-parameterized networks give training more chances to find good features. Tijmen discusses challenges in automating compression, such as error propagation, and introduces data-free quantization techniques that improve performance across a range of models.
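
As a rough illustration of the kind of quantization discussed in the episode, the sketch below applies simple symmetric 8-bit post-training quantization to a weight tensor. The function names and the per-tensor scaling choice are assumptions for the example; this is not the data-free quantization method Tijmen describes.

```python
import numpy as np

# Hedged sketch: symmetric, per-tensor 8-bit quantization of a weight matrix.
# Generic illustration only; not the episode's data-free quantization method.
def quantize_int8(w, num_bits=8):
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    scale = np.abs(w).max() / qmax            # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

# Usage: round-trip a random weight matrix and inspect the quantization error.
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
print("max abs error:", np.abs(w - dequantize(q, scale)).max())
```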
Model Size
- Start with a smaller model architecture if possible.
- Train a larger model and compress it only when it is no more than roughly 2-3x too big for your target.
Lottery Ticket Hypothesis
- The lottery ticket hypothesis suggests that training a large network and then pruning it can work better than training a small network directly.
- More parameters mean more chances to find a well-performing subnetwork of good features (see the pruning sketch below).
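
A minimal sketch of that idea, assuming a PyTorch setup: train a large model, keep only the largest-magnitude weights, and rewind the survivors to their original initialization before retraining, in the spirit of the lottery ticket hypothesis. The model, sparsity level, and helper names are illustrative placeholders.

```python
import torch
import torch.nn as nn

# Minimal lottery-ticket-style magnitude pruning sketch (illustrative only).
def magnitude_masks(model: nn.Module, sparsity: float = 0.8):
    """Keep the largest-magnitude weights in each weight matrix."""
    masks = {}
    for name, param in model.named_parameters():
        if param.dim() < 2:                    # skip biases and norm parameters
            continue
        k = max(1, int(param.numel() * sparsity))
        threshold = param.abs().flatten().kthvalue(k).values
        masks[name] = (param.abs() > threshold).float()
    return masks

def rewind_to_ticket(model: nn.Module, masks, init_state):
    """Reset surviving weights to their original initialization, zero the rest."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                param.copy_(init_state[name] * masks[name])

model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
init_state = {k: v.clone() for k, v in model.state_dict().items()}
# ... train `model` on your task here ...
masks = magnitude_masks(model, sparsity=0.8)
rewind_to_ticket(model, masks, init_state)     # the "winning ticket" to retrain
```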
Compressing ResNets
- For ResNet architectures, start with compression.
- Use tensor factorization or channel pruning algorithms (a factorization sketch follows below).
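
As one concrete example of tensor factorization (a hedged sketch, not the specific algorithms from the episode): approximate a layer's weight matrix with a truncated SVD and replace it with two smaller layers. It is shown on `nn.Linear` for simplicity; ResNet convolutions would need a conv-specific decomposition, and the rank and layer sizes below are arbitrary.

```python
import torch
import torch.nn as nn

# Hedged sketch: low-rank factorization of a linear layer via truncated SVD.
def factorize_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    W = layer.weight.data                             # (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]                      # fold singular values into U
    V_r = Vh[:rank, :]
    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    first.weight.data.copy_(V_r)
    second.weight.data.copy_(U_r)
    if layer.bias is not None:
        second.bias.data.copy_(layer.bias.data)
    return nn.Sequential(first, second)

# Usage: a rank-64 factorization of a 512x512 layer cuts its weight count by ~4x.
layer = nn.Linear(512, 512)
compressed = factorize_linear(layer, rank=64)
x = torch.randn(8, 512)
print("max output difference:", (layer(x) - compressed(x)).abs().max().item())
```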