The String Quotation Mark in ChatGPT

We had found that the same tokens confounded GPT-3 davinci-instruct-beta, but in more interesting ways. Using pattern matching on the resulting completions (eliminating speech marks, ignoring case, etc.), we were able to eliminate all but a few thousand tokens, the vast majority having been repeated with no problem, if occasionally capitalized or spelled out with hyphens between each letter. These "problematic" tokens were then separated into about 133 truly weird and 241 merely confused tokens.
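As a rough illustration of the kind of pattern matching described above, here is a minimal Python sketch. It is not the code used in the work being discussed: the function names (`normalize`, `is_faithful_repeat`, `suspect_tokens`), the set of quotation marks, and the sample data are all assumptions made for this example. It treats a completion as a faithful repeat if it matches the token after stripping speech marks and ignoring case, or if it spells the token out with hyphens between each letter.

```python
# Quotation marks to ignore when comparing (an assumed set, not from the source).
QUOTE_CHARS = "\"'`\u2018\u2019\u201c\u201d"


def normalize(text: str) -> str:
    """Drop quotation marks, trim whitespace, and lowercase."""
    without_quotes = "".join(ch for ch in text if ch not in QUOTE_CHARS)
    return without_quotes.strip().lower()


def is_faithful_repeat(token: str, completion: str) -> bool:
    """Return True if the completion repeats the token, allowing for
    quotes, case changes, or a hyphen-separated spelling."""
    target = normalize(token)
    answer = normalize(completion)
    if answer == target:
        return True
    # Accept spellings such as "H-E-L-L-O" for the token "hello".
    return answer == "-".join(target)


def suspect_tokens(results: dict[str, str]) -> list[str]:
    """Keep only tokens whose completion is not a faithful repeat."""
    return [tok for tok, out in results.items()
            if not is_faithful_repeat(tok, out)]


# Hypothetical token -> completion pairs, invented for illustration.
sample = {
    " hello": '"Hello"',
    "world": "W-O-R-L-D",
    "exampletoken": "I can't hear you.",
}
remaining = suspect_tokens(sample)
```

With this made-up sample, only `exampletoken` would remain for closer inspection. A filter like this only narrows the list; sorting the leftovers into categories would still need manual review.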
