
How Does AI Work? (Robert Wright & Timothy Nguyen)
Robert Wright's Nonzero
00:00
The Transformer's Attention Mechanism
Transformers are doing hierarchical processing. No wonder they consume so much compute, as they say in the business. So each word just has three numbers attached to it: a query value, a key value, and a value value? But how do you assign these? Even in this cartoon analogy, nobody really knows, because these numbers are so hard to interpret.
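The query/key/value idea being discussed can be sketched in a few lines of numpy. Each token's embedding is multiplied by three learned projection matrices to get its query, key, and value vectors; attention then scores every query against every key, softmaxes the scores, and uses the resulting weights to mix the value vectors. The random weight matrices below are placeholders standing in for trained parameters, which is exactly why the individual numbers are hard to interpret:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V, returning output and weights."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # (seq, seq) query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)  # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V, weights               # weighted mix of value vectors

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8
X = rng.normal(size=(seq_len, d_model))       # one embedding per token

# Learned projections (random here, standing in for trained weights)
W_q = rng.normal(size=(d_model, d_k))
W_k = rng.normal(size=(d_model, d_k))
W_v = rng.normal(size=(d_model, d_k))

# Every token gets its own query, key, and value vector
Q, K, V = X @ W_q, X @ W_k, X @ W_v
out, weights = scaled_dot_product_attention(Q, K, V)

print(out.shape)        # one output vector per token: (4, 8)
print(weights.shape)    # one attention distribution per token: (4, 4)
```

Each row of `weights` sums to 1, so every token's output is a convex combination of all tokens' value vectors; in a real transformer this block is repeated across many heads and layers, which is where the hierarchical processing and the compute bill come from.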