LessWrong (Curated & Popular)

“Anomalous Tokens in DeepSeek-V3 and r1” by henry

Jan 28, 2025
Dive into the wild world of anomalous tokens in the DeepSeek-V3 model! Discover how these unusual glitch tokens can produce bizarre responses and unexpected behaviors. Unearth the significance of fragment tokens and non-English anomalies, particularly in regional languages like Cebuano. Explore DeepSeek's tendency to fall into endless repetition and how this affects user interactions. Join a journey through the peculiarities of AI language models that challenge our understanding of text generation!
INSIGHT

Fragment Tokens

  • Many DeepSeek tokens are "fragment tokens," meaningful only within larger strings.
  • These fragments are common in large vocabularies but warrant further study.
INSIGHT

" """ Token Interpretation

  • The token " """ in DeepSeek is often misinterpreted as Unicode symbols, acronyms with "M," or emojis.
  • Context clues can sometimes help DeepSeek interpret it correctly as a word.
INSIGHT

Non-English Tokens

  • Many non-English glitch tokens were found in DeepSeek, primarily in Cebuano and other Filipino languages.
  • Some tokens translate directly, while others become random words.