Machine Learning Street Talk (MLST) cover image

Facebook Research - Unsupervised Translation of Programming Languages

Machine Learning Street Talk (MLST)

00:00

Understanding Word Piece Tokenization and Autoregressive Language Models

This chapter explores the concept of tokenization in language processing, specifically through word piece tokenization. It highlights the benefits of breaking words into smaller components and discusses the autoregressive decoding process that utilizes a language token for output conditioning.

Transcript
Play full episode

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app