“Anomalous Tokens in DeepSeek-V3 and r1” by henry

LessWrong (Curated & Popular)

Intro

This chapter investigates anomalous ("glitch") tokens in the DeepSeek-V3 and r1 models. It outlines the methodology for identifying these tokens and sorts them into categories, highlighting the distinct characteristics of "fragment tokens."
