How to Keep Scaling Large Language Models When Data Runs Out
The title of the paper is "How to Keep Scaling Large Language Models When Data Runs Out". The research paper trains 400 models with up to 9 billion parameters and 900 billion tokens. It is an extension of the Chinchilla scaling laws to the setting where data is limited. Again, pretty much the whole story is in the title, but to give some more context: last year the Chinchilla equation was found, which basically tells you how long you should train a language model for a given model size (a sketch of that equation is included after this segment).

Next we have an AI research paper that dives into the limitations and capabilities of transformer large language models, empirically and theoretically, on compositional tasks.
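As referenced above, the Chinchilla result is usually written as a parametric loss in model size N and training tokens D. The sketch below uses the approximate fitted constants reported by Hoffmann et al. (2022); they are background assumptions for illustration, not values quoted in this episode.

% Chinchilla parametric loss: expected loss as a function of
% model parameters N and training tokens D (Hoffmann et al., 2022).
% The constants below are approximate fitted values (assumption).
\[
  L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}},
  \qquad
  E \approx 1.69,\; A \approx 406.4,\; B \approx 410.7,\;
  \alpha \approx 0.34,\; \beta \approx 0.28.
\]
% Minimizing L under a fixed compute budget C \approx 6 N D yields the
% familiar rule of thumb of roughly 20 training tokens per parameter,
% which the paper discussed here extends to the data-limited regime.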