
Episode 33: Tri Dao, Stanford: On FlashAttention and sparsity, quantization, and efficient inference
Generally Intelligent
The Future of Recurrent Models
The idea is that recurrent models don't need to store the entire history when they do inference. Instead, they compress the history into a fixed-size state vector. One example is RWKV, a recurrent model that some folks have trained up to 14 billion parameters, and it seems to perform on par with transformers. That's not as proven yet, but it's an exciting direction.
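A minimal sketch of the point being made, with hypothetical dimensions and a generic tanh update rather than RWKV's actual update rule: recurrent inference carries a fixed-size state forward, so memory stays constant no matter how long the sequence gets, whereas a transformer's KV cache grows with every token.

```python
import numpy as np

# Hypothetical sizes for illustration only.
d_state, d_in = 64, 32
rng = np.random.default_rng(0)
A = rng.normal(size=(d_state, d_state)) * 0.1  # state transition
B = rng.normal(size=(d_state, d_in)) * 0.1     # input projection

state = np.zeros(d_state)              # fixed-size state vector
for t in range(10_000):                # stream tokens one at a time
    x_t = rng.normal(size=d_in)        # stand-in for a token embedding
    state = np.tanh(A @ state + B @ x_t)  # compress history into the state

# Memory for the history stays O(d_state), independent of sequence length.
# A transformer's KV cache would instead grow as O(t * d_model).
```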