
Episode 33: Tri Dao, Stanford: On FlashAttention and sparsity, quantization, and efficient inference
Generally Intelligent
The Future of Recurrent Networks
I think transformers are here to stay, at least for a while. Lots of infrastructure, even at the level of software frameworks and hardware, is very much tailored to transformers. So I don't think anything is going to completely replace them very soon. But we'll see new approaches coming out, simply because there are now new applications that require either long context or reasoning ability.