
A primer on AI for developers with Swyx from Latent Space
Svelte Radio
How to Use OpenAI to Improve Document Search
The default GPT-3 context is 4,000 tokens. A token is essentially a short sequence of characters, often a word or a piece of a word, and the tokenizer just sort of serializes each one into a specific number. So for example, a word like Brittany might be two tokens, right? And those numbers always represent Brittany. There are also very fun tricks that have arisen out of the corpus of data that this tokenizer was trained on. If you go to platform.openai.com and then you look for the tokenizer, you can actually just punch in words and see those numbers for yourself.
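To make that concrete, here's a minimal sketch using OpenAI's open-source tiktoken library, which isn't mentioned in the episode itself. It assumes the r50k_base encoding, the one the original GPT-3 models used, so exact token counts may differ under newer encodings.

```python
# Sketch: inspect how a word is split into tokens, similar to the
# web tokenizer on platform.openai.com. Assumes the tiktoken package
# is installed (pip install tiktoken).
import tiktoken

# r50k_base is the encoding used by the original GPT-3 models.
enc = tiktoken.get_encoding("r50k_base")

# A word like "Brittany" splits into a few subword tokens.
tokens = enc.encode("Brittany")
print(tokens)  # a list of integer token IDs

# Each ID always maps back to the same fixed chunk of text.
for t in tokens:
    print(t, repr(enc.decode([t])))

# Round-tripping the IDs recovers the original string.
assert enc.decode(tokens) == "Brittany"
```

The same IDs come back every time for the same text, which is the "always represent Brittany" behavior described above.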