
Episode 33: Tri Dao, Stanford: On FlashAttention and sparsity, quantization, and efficient inference

Generally Intelligent


The Impact of Sparse Training on Language Models

We found some parameterization that involves sparsity that's expressive and hardware efficient. We were able to use that to train image classification models, say on ImageNet, and it gives an actual wall-clock speedup while maintaining the same level of quality. Now these things are being used at SambaNova, which is a chip company. And lots of work is now building on the idea of sparse training.
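To make the idea concrete: one family of sparse-but-hardware-efficient parameterizations replaces a dense weight matrix with a product of block-diagonal matrices and a fixed permutation, in the spirit of Tri Dao's butterfly/Monarch line of work. The sketch below is illustrative only, not the episode's or any paper's actual code; the class names, block count, and permutation scheme are assumptions for demonstration.

```python
import torch
import torch.nn as nn

class BlockDiagLinear(nn.Module):
    """Linear map whose weight is restricted to `nblocks` independent
    dense blocks, so a d x d matmul costs O(d^2 / nblocks) FLOPs
    instead of O(d^2), while each block still maps well onto GPU
    tensor cores (structured, hardware-friendly sparsity)."""
    def __init__(self, dim: int, nblocks: int):
        super().__init__()
        assert dim % nblocks == 0
        self.nblocks, self.bdim = nblocks, dim // nblocks
        self.weight = nn.Parameter(
            torch.randn(nblocks, self.bdim, self.bdim) / self.bdim ** 0.5
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, dim)
        b = x.shape[0]
        x = x.view(b, self.nblocks, self.bdim)   # split features into blocks
        # Batched per-block matmul: each block gets its own small dense matrix.
        y = torch.einsum("bnd,nde->bne", x, self.weight)
        return y.reshape(b, -1)

class StructuredSparseLinear(nn.Module):
    """Two block-diagonal maps with a fixed stride permutation in
    between, so information mixes across blocks (a Monarch-style
    construction, used here purely as an illustration)."""
    def __init__(self, dim: int, nblocks: int):
        super().__init__()
        self.l1 = BlockDiagLinear(dim, nblocks)
        self.l2 = BlockDiagLinear(dim, nblocks)
        # "Transpose" permutation interleaves the blocks between the two maps.
        self.register_buffer(
            "perm", torch.arange(dim).view(nblocks, -1).t().reshape(-1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.l2(self.l1(x)[:, self.perm])

# Example: a 1024-wide layer with 32 blocks uses 2 * 1024 * 32 = 65,536
# multiply-adds per token versus ~1M for a dense 1024 x 1024 matmul.
layer = StructuredSparseLinear(dim=1024, nblocks=32)
y = layer(torch.randn(8, 1024))  # -> shape (8, 1024)
```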

