
Episode 33: Tri Dao, Stanford: On FlashAttention and sparsity, quantization, and efficient inference
Generally Intelligent
00:00
The Importance of Feedback in Inference
I used to focus a lot more on the mathematically interesting aspects of a project. Now I believe that we should engage much more with practitioners, because they give us the right kind of feedback to get better at developing our methods. How would you get the [inaudible] 10x, 3x? Well, it also matters what you're asking. Are we talking about latency, energy, [inaudible]? Like, what exactly are we talking about? Yeah.