

Yann LeCun: Filling the Gap in Large Language Models

Eye On A.I.

NOTE

Self-Attention in Neural Turing Machines

One person called it the stack-augmented memory network. Another called it the key-value memory network, and then there were a whole bunch of other variants. These use associative memories, which are the basic modules used inside transformers. Attention mechanisms like this were popularized around 2015 by a paper from Yoshua Bengio's group at Mila, which demonstrated that they are extremely powerful for things like language translation in NLP. This started the craze on attention. And so you combine all these ideas and you get a transformer, which uses something called self-attention, where the input tokens are used both as queries and keys in an associative memory, very much like a memory network. And then you view this as a layer, if you want; you put several of those in a layer, and then you stack those layers, and that's what the transformer is.
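
To make the "tokens as both queries and keys" idea concrete, here is a minimal NumPy sketch of a single self-attention head, with a few such layers stacked. The weight matrices, dimensions, and residual connection are illustrative assumptions, not details from the episode; a real transformer also adds multiple heads, feed-forward blocks, and normalization.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, Wq, Wk, Wv):
    """One self-attention head: the same input tokens act as the
    queries and the keys (and values) of an associative memory."""
    Q = tokens @ Wq                            # queries
    K = tokens @ Wk                            # keys
    V = tokens @ Wv                            # values
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # query/key similarity
    weights = softmax(scores, axis=-1)         # soft lookup into the memory
    return weights @ V                         # weighted read-out

# Stack several such layers (hypothetical sizes, random weights for illustration)
rng = np.random.default_rng(0)
d = 16                                   # embedding dimension
tokens = rng.normal(size=(10, d))        # 10 input tokens
for _ in range(3):                       # three stacked self-attention layers
    Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
    tokens = tokens + self_attention(tokens, Wq, Wk, Wv)  # residual add
print(tokens.shape)                      # (10, 16)
```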
