
How Does AI Work? (Robert Wright & Timothy Nguyen)
Robert Wright's Nonzero
The Transformer Architecture for Large Language Model Success
If we only had three dimensions, we might think of words as being located on a kind of three-dimensional semantic map. That would be true if there were only three dimensions, but these models use a lot more, so we can't quite conceive of the map. With more dimensions and more parameters you can model more things; you can be more expressive. There's more room to move. Typically these words live in a roughly thousand-dimensional space. The key to the transformer architecture underlying the recent language model success is that you can condition these vectors on the context. So it's a context-dependent embedding, and that's a large part of why these models are so powerful.
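To make the "context-dependent embedding" point concrete, here is a minimal Python sketch (not from the episode): a static lookup table gives the word "bank" the same vector in every sentence, while a single self-attention pass mixes in the neighboring words, so "bank" in "the river bank" ends up with a different vector than "bank" in "a loan from the bank." The toy vocabulary, the 8-dimensional embeddings, and the random stand-in weight matrices are all illustrative assumptions, not the models discussed in the episode.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary and a static embedding table: one fixed vector per word,
# regardless of the sentence it appears in. Real models use on the order of
# a thousand dimensions; we use 8 so the arrays stay readable.
vocab = ["the", "river", "bank", "a", "loan", "from"]
dim = 8
embedding_table = {w: rng.normal(size=dim) for w in vocab}

# Shared weight matrices (random stand-ins for learned query/key/value
# projections), fixed across all sentences.
Wq, Wk, Wv = (rng.normal(size=(dim, dim)) for _ in range(3))

def static_embed(sentence):
    """Look up each word's fixed vector -- identical in every context."""
    return np.stack([embedding_table[w] for w in sentence])

def contextual_embed(sentence):
    """One self-attention pass: each output vector is a weighted mix of the
    whole sentence, so the vector for 'bank' depends on its neighbors."""
    x = static_embed(sentence)                       # (seq_len, dim)
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(dim)                  # pairwise attention scores
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
    return weights @ v                               # context-mixed vectors

s1 = ["the", "river", "bank"]
s2 = ["a", "loan", "from", "the", "bank"]

# Static embedding: 'bank' gets the same vector in both sentences.
print(np.allclose(static_embed(s1)[2], static_embed(s2)[4]))   # True

# Contextual embedding: 'bank' gets different vectors once context is mixed in.
print(np.allclose(contextual_embed(s1)[2], contextual_embed(s2)[4]))  # False
```

The design choice this illustrates is the one made in the excerpt: the embedding is no longer a property of the word alone but of the word in its sentence, which is what the transformer's attention mechanism provides.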