
Episode 33: Tri Dao, Stanford: On FlashAttention and sparsity, quantization, and efficient inference
Generally Intelligent
The Impact of Sparse Training on Language Models
We found some parameterization that involves sparsity that's expressive and hardware-efficient. We were able to use that to train image classification models, say on ImageNet. That gives an actual wall-clock speedup while maintaining the same level of quality. Now these ideas are being used at SambaNova, which is a chip company, and a lot of work is now building on the idea of sparse training.
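As an illustration only (not the exact parameterization discussed in the episode), here is a minimal PyTorch sketch of one common hardware-efficient sparse parameterization: a block-diagonal linear layer. The class name `BlockSparseLinear` and the block layout are assumptions for clarity; the point is that the sparsity is structured, so each block is still a dense matrix multiply that maps well to accelerators.

```python
# Minimal sketch of a structured-sparse (block-diagonal) linear layer.
# Hypothetical example, not the method described in the episode.
import torch
import torch.nn as nn


class BlockSparseLinear(nn.Module):
    """Linear layer whose weight is block-diagonal: each of `num_blocks`
    blocks maps a slice of the input to a slice of the output."""

    def __init__(self, in_features: int, out_features: int, num_blocks: int):
        super().__init__()
        assert in_features % num_blocks == 0 and out_features % num_blocks == 0
        self.num_blocks = num_blocks
        self.in_block = in_features // num_blocks
        self.out_block = out_features // num_blocks
        # One (out_block x in_block) dense block per group; far fewer
        # parameters and FLOPs than a full (out_features x in_features) matrix.
        self.weight = nn.Parameter(
            torch.randn(num_blocks, self.out_block, self.in_block) * 0.02
        )
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_features) -> (batch, num_blocks, in_block)
        b = x.shape[0]
        x = x.view(b, self.num_blocks, self.in_block)
        # Batched matmul over blocks keeps the compute dense within each block,
        # which is what makes this kind of sparsity hardware-friendly.
        y = torch.einsum("bgi,goi->bgo", x, self.weight)
        return y.reshape(b, -1) + self.bias


if __name__ == "__main__":
    layer = BlockSparseLinear(in_features=512, out_features=512, num_blocks=8)
    out = layer(torch.randn(4, 512))
    print(out.shape)  # torch.Size([4, 512])
```

With 8 blocks, this layer stores 8 blocks of 64x64 weights (32,768 parameters) instead of a dense 512x512 matrix (262,144 parameters), which is where the wall-clock savings come from when the structure is exploited by the hardware.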