“What o3 Becomes by 2028” by Vladimir_Nesov

LessWrong (Curated & Popular)

Exploring the Value of Training Tokens and Computational Strategies

This chapter explores the importance of training on a vast quantity of tokens, estimated at 50 trillion or more. It examines how token repetition affects model perplexity and discusses the balance between data quality and quantity, including the potential for more diverse data formats in future training methods.
