129 - Transformers and Hierarchical Structure, with Shunyu Yao

NLP Highlights

CHAPTER

Is Scalar Positional Encoding Important for Formal Languages?

For all of this, like a Dyck-processing transformer, to work, we actually assume a very special kind of positional encoding. We call this scalar positional encoding, or scalar PE. So the idea is that you have this separate, fixed dimension in your token embedding that ranges from zero to one. And also, experimentally, we show that with that kind of scalar positional encoding, transformers can actually learn from finite samples and generalize to longer lengths of input. But with more traditional positional encodings, like learnable embeddings or, you know, Fourier features, cosine and sine...
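The mechanism described here can be sketched in a few lines. The following is a minimal illustration, not code from the episode or the underlying paper: it appends one extra, fixed dimension to each token embedding holding that token's normalized position, so the value always lies between zero and one regardless of sequence length. The function name and the exact normalization (i / n here) are assumptions for illustration; the episode only says the values range from zero to one.

```python
import torch

def add_scalar_pe(token_embeddings: torch.Tensor) -> torch.Tensor:
    """Append a scalar positional-encoding dimension to token embeddings.

    token_embeddings: shape (n, d) for a length-n sequence.
    Returns: shape (n, d + 1), where the extra dimension holds the
    token's position normalized into [0, 1). The normalization i / n
    is an assumption made for this sketch.
    """
    n, _ = token_embeddings.shape
    positions = torch.arange(
        n, dtype=token_embeddings.dtype, device=token_embeddings.device
    ) / n  # position i mapped to i / n, always in [0, 1)
    return torch.cat([token_embeddings, positions.unsqueeze(1)], dim=1)

# Usage: a toy sequence of 5 tokens with 8-dimensional embeddings.
emb = torch.randn(5, 8)
emb_with_pe = add_scalar_pe(emb)  # shape (5, 9)
```

Because the extra dimension depends only on relative position within the sequence, not on a learned table or a fixed set of frequencies, it stays meaningful at input lengths never seen during training, which is the length-generalization property discussed in the snippet.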

