The Retentive Network (RetNet) is proposed as a successor to the Transformer for large language models. It combines attributes of Transformers with those of recurrent networks: the model can be trained with massive parallelism, yet run inference recurrently at low cost, and the authors claim performance competitive with comparable Transformers.
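The dual parallel/recurrent behavior can be illustrated with a minimal sketch of the retention mechanism. This is an assumed, simplified single-head version (names like `gamma` and the q/k/v projections follow the RetNet paper's notation), not the official implementation:

```python
import numpy as np

def retention_parallel(q, k, v, gamma):
    """Parallel (training) form: O = (Q K^T * D) V, where D is a
    causal decay mask with D[i, j] = gamma^(i-j) for j <= i."""
    n = q.shape[0]
    idx = np.arange(n)
    D = np.tril(gamma ** (idx[:, None] - idx[None, :]))  # causal decay mask
    return (q @ k.T * D) @ v

def retention_recurrent(q, k, v, gamma):
    """Recurrent (inference) form: a single state is updated per step,
    S_n = gamma * S_{n-1} + k_n^T v_n, then read out as o_n = q_n S_n."""
    S = np.zeros((q.shape[1], v.shape[1]))
    out = []
    for n in range(q.shape[0]):
        S = gamma * S + np.outer(k[n], v[n])
        out.append(q[n] @ S)
    return np.stack(out)

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((5, 4)) for _ in range(3))
# Both forms compute the same output: parallel for training throughput,
# recurrent for O(1)-per-token inference.
print(np.allclose(retention_parallel(q, k, v, 0.9),
                  retention_recurrent(q, k, v, 0.9)))
```

Expanding the recurrence gives o_n = Σ_{j≤n} γ^(n−j) (q_n·k_j) v_j, which is exactly what the masked parallel form computes, so the two paths are interchangeable.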
Our 132nd episode with a summary and discussion of last week's big AI news!