The Importance of State Space Models in Signal Processing
Is there a specific intuition for why this stacking worked? It maybe starts to seem like a sliding-window type of approach, where you've got a buffer that lets you compare tokens. And we can actually prove that a standard state space model wouldn't capture some of these multiplicative elements. But with the state space model, we can create a kind of store for global memory over the entire sequence.
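To make the contrast between a sliding-window buffer and a global memory concrete, here is a minimal sketch in Python. It is not the model discussed in the episode: the scalar recurrence, the parameter values (a, b, c), the window size, and the function names are all assumptions chosen for illustration. It feeds a single impulse into both a toy linear state space recurrence and a fixed-size window, then checks which one still "remembers" the impulse several steps later.

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Toy scalar linear state space recurrence (illustrative values only).

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x   # the state folds in every input seen so far
        ys.append(c * h)
    return ys


def sliding_window_sum(xs, window=3):
    """Sum over the last `window` inputs only; older inputs are forgotten."""
    ys = []
    for t in range(len(xs)):
        start = max(0, t - window + 1)
        ys.append(sum(xs[start:t + 1]))
    return ys


# A single impulse at step 0, followed by silence.
impulse = [1.0] + [0.0] * 9

ssm_out = ssm_scan(impulse)
win_out = sliding_window_sum(impulse)

# The window drops the impulse once it leaves the buffer,
# while the recurrent state still carries a decayed trace of it.
print("window output at step 9:", win_out[9])
print("ssm output at step 9:   ", ssm_out[9])
```

In this toy setup the window's output falls to zero as soon as the impulse leaves the buffer, whereas the recurrent state keeps a geometrically decaying trace of it. Note that the recurrence is purely linear in the inputs, so on its own it cannot form products between tokens; that is consistent with the point about multiplicative elements, though the sketch does not prove it.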