
129 - Transformers and Hierarchical Structure, with Shunyu Yao
NLP Highlights
"I agree with everything."
Long-term memory is like the one back there, but they're actually different matters. So all the memory we talk about is per-token, per-layer. The real question: can you get better bounds if you have more layers? You show that you can do this with depth two; what if I have depth eight? Does it change my bounds? A very interesting question.
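The per-token, per-layer memory being discussed can be made concrete with a rough back-of-the-envelope calculation of a transformer's key/value cache, which grows linearly in both depth and sequence length. All dimensions below (layer counts, model width, precision) are illustrative assumptions, not figures from the episode:

```python
def kv_cache_bytes(n_layers: int, n_tokens: int, d_model: int,
                   bytes_per_value: int = 2) -> int:
    """Approximate KV-cache size for a decoder-only transformer.

    Each layer stores one key vector and one value vector per token,
    so total memory scales as layers * tokens * 2 * d_model.
    """
    return n_layers * n_tokens * 2 * d_model * bytes_per_value

# Hypothetical depth-2 vs. depth-8 models, same width and context:
shallow = kv_cache_bytes(n_layers=2, n_tokens=1024, d_model=512)
deep = kv_cache_bytes(n_layers=8, n_tokens=1024, d_model=512)
print(shallow, deep)  # depth 8 needs 4x the cache of depth 2
```

This only sketches raw memory scaling; whether extra depth buys better expressiveness bounds, as asked in the snippet, is a separate theoretical question.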