
"SolidGoldMagikarp (plus, prompt generation)"
LessWrong (Curated & Popular)
GPT-2 and GPT-3 davinci-instruct-beta Tokenization Processes
ChatGPT can easily be made to produce the desired token string, but it strongly resists producing it in isolation. This list of 133 candidate weird tokens is not meant to be definitive, but should serve as a good starting point for exploration of these types of anomalous behaviors. The non-determinism at temperature zero, we guess, is caused by floating-point errors during forward propagation. Possibly the "not knowing what to do" leads to maximum uncertainty, so that logits for multiple completions are maximally close, and hence these errors (which, despite a lack of documentation, GPT insiders inform us are a known but rare phenomenon) are more reliably produced.
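The floating-point guess above can be illustrated with a toy sketch. It is not a model of GPT itself: the terms, the competitor logit, and the helper names `accumulate` and `greedy_pick` are all hypothetical, chosen only to show that when two logits are nearly tied, a change in summation order can flip the argmax that temperature-zero decoding takes.

```python
def accumulate(terms):
    """Sum the terms left to right in double precision."""
    total = 0.0
    for t in terms:
        total += t
    return total


def greedy_pick(logits):
    """Temperature-zero decoding: index of the largest logit (ties go to the lowest index)."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Hypothetical contributions to one token's logit. Mathematically they sum to 0.6,
# but the rounded result depends on the order in which they are added.
terms = [0.1, 0.2, 0.3]
competitor_logit = 0.6  # hypothetical logit of a rival token, an exact near-tie

forward_sum = accumulate(terms)             # (0.1 + 0.2) + 0.3
reverse_sum = accumulate(reversed(terms))   # (0.3 + 0.2) + 0.1

pick_a = greedy_pick([competitor_logit, forward_sum])
pick_b = greedy_pick([competitor_logit, reverse_sum])

print(forward_sum, reverse_sum)  # the two sums differ in the last bit
print(pick_a, pick_b)            # the chosen index differs between the two orders
```

With logits that are well separated, the same one-bit difference would leave the argmax unchanged, which fits the suggestion that near-ties make the effect easier to observe.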
Transcript


