
129 - Transformers and Hierarchical Structure, with Shunyu Yao
NLP Highlights
How Much Memory Do You Need to Process This Dyck Language?
We show that they can systematically process this hierarchical language of bounded depth. And not only in one way, but in two different ways. So think of a Dyck language with k types of different brackets and a depth limit of d. We are showing that actually transformers can use a memory of [unclear] per token. That is like a d times overhead in terms of the memory.
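To make the object under discussion concrete, here is a minimal sketch of what membership in a Dyck language with k bracket types and a depth limit d means. It is a plain stack-based checker, not the transformer construction discussed in the episode; the function name `is_dyck`, its parameters, and the `PAIRS` table are hypothetical names chosen for illustration.

```python
def is_dyck(tokens, pairs, max_depth):
    """Check whether `tokens` is a balanced bracket string over `pairs`
    whose nesting depth never exceeds `max_depth`.

    `pairs` maps each opening bracket to its closing bracket, so
    len(pairs) plays the role of k and `max_depth` the role of d.
    """
    closers = {close: open_ for open_, close in pairs.items()}
    stack = []  # currently open brackets, innermost last
    for tok in tokens:
        if tok in pairs:
            stack.append(tok)
            if len(stack) > max_depth:
                return False  # nesting deeper than the limit d
        elif tok in closers:
            if not stack or stack[-1] != closers[tok]:
                return False  # closer does not match the innermost opener
            stack.pop()
        else:
            return False  # symbol outside the bracket alphabet
    return not stack  # every opener must have been closed


# Illustrative alphabet with k = 2 bracket types (an assumption for the demo).
PAIRS = {"(": ")", "[": "]"}

print(is_dyck("([])", PAIRS, max_depth=2))
print(is_dyck("([])", PAIRS, max_depth=1))
```

In this sketch the stack never holds more than d open brackets, and each entry only has to record which of the k bracket types it is.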