
How Does AI Work? (Robert Wright & Timothy Nguyen)
Robert Wright's Nonzero
Attention Is All You Need
A predecessor to the transformer architecture is what are called recurrent neural networks. There's a version of these called LSTMs, long short-term memory networks. We don't have to go into the details, but at a high level, how are they fundamentally different from a transformer? A recurrent network unrolls word by word: if I want to do "the cat is furry," I feed it "the cat is" one word at a time and it predicts "furry." The transformer, by contrast, does a one-time forward computation. It's not temporal in that sense: I just feed it the entire sentence and it gives me the output from that entire sentence.
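A minimal sketch of the contrast being described, not from the episode: a recurrent network (LSTM) consuming a sentence one token at a time versus a transformer encoder processing the whole sentence in a single forward pass. It assumes PyTorch; the vocabulary size, dimensions, and token ids are all illustrative.

```python
import torch
import torch.nn as nn

vocab_size, d_model, seq_len = 100, 32, 4          # "the cat is furry" -> 4 token ids
tokens = torch.randint(0, vocab_size, (1, seq_len))  # batch of one sentence
embed = nn.Embedding(vocab_size, d_model)
x = embed(tokens)                                    # (1, seq_len, d_model)

# Recurrent network (LSTM): processes the sentence one word at a time,
# carrying a hidden state forward -- the "unrolling" described above.
lstm = nn.LSTM(input_size=d_model, hidden_size=d_model, batch_first=True)
state = None
for t in range(seq_len):
    out, state = lstm(x[:, t:t + 1, :], state)       # one time step per call
rnn_context = out                                    # context after the last word

# Transformer encoder: one forward computation over the entire sentence at once;
# attention lets every position see every other position in parallel.
layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
transformer = nn.TransformerEncoder(layer, num_layers=1)
tf_out = transformer(x)                              # (1, seq_len, d_model) in one pass
```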