Microsoft Research Podcast

Abstracts: NeurIPS 2024 with Weizhu Chen 

Dec 6, 2024

08:27

Episode notes

Next-token prediction trains a language model on all tokens in a sequence. VP Weizhu Chen discusses his team’s 2024 NeurIPS paper on how distinguishing between useful and “noisy” tokens in pretraining can improve token efficiency and model performance.
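As a rough illustration of the idea in the notes above, the sketch below contrasts a loss averaged over every token with a loss averaged over only a chosen subset. It is a minimal sketch, not the paper's method: the function name `next_token_loss`, the `keep_mask` argument, and the example probabilities are all hypothetical, and how useful tokens are actually told apart from noisy ones is not described in these notes, so the mask is simply assumed to be given.

```python
import math


def next_token_loss(token_probs, keep_mask=None):
    """Mean negative log-likelihood over the tokens flagged in keep_mask.

    token_probs: probability the model assigned to each correct next token.
    keep_mask:   hypothetical per-token flags (True = token counts toward
                 the loss). None means every token counts, which is the
                 standard next-token objective.
    """
    if keep_mask is None:
        keep_mask = [True] * len(token_probs)
    if len(keep_mask) != len(token_probs):
        raise ValueError("keep_mask and token_probs must have the same length")

    # Only tokens flagged True contribute to the loss.
    losses = [-math.log(p) for p, keep in zip(token_probs, keep_mask) if keep]
    if not losses:
        raise ValueError("keep_mask selects no tokens")
    return sum(losses) / len(losses)


# Made-up probabilities for a four-token sequence; the second token is
# treated as "noisy" purely for the sake of the example.
probs = [0.9, 0.05, 0.8, 0.6]
mask = [True, False, True, True]

full_loss = next_token_loss(probs)             # averages over all four tokens
selective_loss = next_token_loss(probs, mask)  # skips the masked-out token
```

In this toy case the masked-out token no longer dominates the average, so the training signal comes only from the tokens that were kept. In a real training setup the same effect would typically be achieved by masking per-token losses before reduction.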
