
Chronos: Learning the Language of Time Series with Abdul Fatir Ansari - #685
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Quality and Diversity of Tokens in Language Models
Language models are trained on high-quality, diverse tokens, with around 84 billion tokens used. This diversity is crucial because each piece of data contributes unique information. Time series training data, by contrast, may over-represent certain types of data, which makes its characteristics less diverse. Augmentation schemes are used to offset this limitation and improve performance.


