Episode 33: Tri Dao, Stanford: On FlashAttention and sparsity, quantization, and efficient inference

Generally Intelligent

The Future of Recurrent Models

The idea is that recurrent models don't need to store the entire history when they do inference. Instead, they compress the history into a fixed-size state vector. One example is RWKV, a recurrent model that some folks have trained up to 14 billion parameters, and it seems to perform on par with transformers. That's not as proven yet, but it's an exciting direction.
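The memory contrast described above can be sketched with a toy example: a hypothetical linear recurrent cell (not RWKV's actual update rule) keeps a single fixed-size state no matter how long the sequence is, while a transformer-style KV cache grows with every token. The matrices `A` and `B` and the dimension `d` here are illustrative assumptions, not anything from the episode.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hidden/state dimension (illustrative choice)

# Hypothetical recurrent cell: the whole history is compressed into one
# d-dimensional state vector, updated in place for each new token.
A = rng.standard_normal((d, d)) * 0.1
B = rng.standard_normal((d, d)) * 0.1

def recurrent_step(state, x):
    # Only O(d) memory is kept, regardless of how many tokens came before.
    return np.tanh(A @ state + B @ x)

tokens = rng.standard_normal((100, d))

state = np.zeros(d)
for x in tokens:
    state = recurrent_step(state, x)
assert state.shape == (d,)  # fixed-size state after 100 tokens

# By contrast, a transformer's KV cache keeps one entry per past token,
# so inference memory grows linearly with sequence length: O(n * d).
kv_cache = [x for x in tokens]
assert len(kv_cache) == 100
```

This constant-memory property is what makes recurrent models attractive for long-sequence inference: the per-token cost stays flat while an attention cache keeps growing.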
