Using a Short Transformer in a Language Model?
Some sort of transformer-based models have shown some of the properties of this kind of memory. But it's very difficult to scale those models up because they require a lot of computation. We don't know yet how to take those models to a scale where they're actually useful for tackling very complex problems. There are now papers trying to deal with that and reduce the complexity to linear.
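The speaker doesn't name a specific paper, but the quadratic cost being referred to is the softmax attention's pairwise score matrix over the sequence. One well-known family of linear-complexity results (e.g., Katharopoulos et al., 2020) replaces the softmax with a kernel feature map so the matrix product can be reassociated. Below is a minimal NumPy sketch of that idea; the function names and the choice of feature map `phi` are assumptions for illustration, not the method of any particular paper mentioned here.

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard attention: builds an (n, n) score matrix,
    # so cost is quadratic in sequence length n.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    # Kernel trick: approximate softmax(QK^T)V with
    # phi(Q) (phi(K)^T V), reassociating the product so the
    # (n, n) matrix is never formed. Cost is linear in n.
    # phi is a hypothetical positive feature map (assumption).
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                    # (d, d_v): aggregated over the sequence
    Z = Qp @ Kp.sum(axis=0)          # (n,): per-query normalizer
    return (Qp @ KV) / Z[:, None]

# Toy usage: n = 8 tokens, d = 4 dimensions.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(8, 4)) for _ in range(3))
print(linear_attention(Q, K, V).shape)  # (8, 4)
```

The key design point is associativity: computing `Kp.T @ V` first yields a small `(d, d_v)` summary of the whole sequence, which each query then reads from, instead of each query attending over all n keys individually.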