
Episode 33: Tri Dao, Stanford: On FlashAttention and sparsity, quantization, and efficient inference

Generally Intelligent


The Future of Recurrent Models

The idea is that recurrent models don't need to store the entire history when they do inference. Instead, they compress the history into a fixed-size state vector. One example is RWKV, a recurrent model that some folks have trained up to 14 billion parameters, and it seems to perform on par with transformers. That's not as proven yet, but it's an exciting direction.
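To make the memory contrast concrete, here is a minimal sketch (not RWKV's actual architecture; the dimensions, weights, and update rule are illustrative assumptions): a recurrent model folds each new token into a fixed-size state, while a transformer-style cache grows with the sequence length.

```python
import numpy as np

# Illustrative sketch (hypothetical, not RWKV itself): a recurrent model
# compresses the whole history into a fixed-size state vector, so
# inference-time memory does not grow with sequence length.

d = 16  # hidden/state dimension (assumed for illustration)
rng = np.random.default_rng(0)
W_h = rng.standard_normal((d, d)) * 0.1  # state-transition weights (assumed)
W_x = rng.standard_normal((d, d)) * 0.1  # input-projection weights (assumed)

def recurrent_step(state, x):
    # Fold the new token into the compressed history; the state's
    # shape stays (d,) no matter how long the sequence gets.
    return np.tanh(W_h @ state + W_x @ x)

state = np.zeros(d)
kv_cache = []  # what a transformer would keep instead (one entry per token)
for t in range(1000):
    x = rng.standard_normal(d)  # stand-in embedding of token t
    state = recurrent_step(state, x)
    kv_cache.append(x)  # transformer memory grows linearly with history

print(state.shape)    # fixed at (16,) regardless of history length
print(len(kv_cache))  # 1000 — grows with the sequence
```

The point of the sketch is the asymmetry at the end: after 1000 tokens the recurrent state is still a single length-16 vector, while the cache a transformer would need has 1000 entries.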
