The Best Strategy for Pruning Your Data
The best strategy for pruning your data set depends on how much data you have. "If you follow this optimal pruning strategy, what we find is that you scale much faster than those power laws I was describing before. The really big returns, the really big savings in compute or tons of CO2, will start to come once you look at really big data sets with billions of examples or trillions of tokens," he says. "We're training some of these very, very large language models, trillion-parameter language models, on very, very large data sets."
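The clip does not spell out the pruning rule itself. As a rough sketch only, the Python below assumes one possible reading of "depends on how much data you have": each example carries a difficulty score, and the pruner keeps the easiest examples when the data set is small and the hardest ones when it is large. The function name `prune_by_difficulty`, the `scores` input, and the `small_data_threshold` cutoff are all hypothetical and do not come from the episode. A hard cutoff is a simplification chosen for readability.

```python
def prune_by_difficulty(scores, keep_fraction, small_data_threshold=10_000):
    """Return the sorted indices of the examples to keep.

    scores: per-example difficulty, higher means harder (hypothetical metric).
    keep_fraction: fraction of examples to retain, in (0, 1].
    small_data_threshold: hypothetical cutoff between "small" and "large" data.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")

    n = len(scores)
    if n == 0:
        return []

    # Always keep at least one example.
    n_keep = max(1, round(n * keep_fraction))

    # Indices ordered from easiest to hardest.
    order = sorted(range(n), key=lambda i: scores[i])

    if n < small_data_threshold:
        # Assumption: with little data, keep the easiest examples.
        kept = order[:n_keep]
    else:
        # Assumption: with abundant data, keep the hardest examples.
        kept = order[-n_keep:]

    return sorted(kept)


# Usage: the same scores give different subsets depending on the size cutoff.
scores = [0.1, 0.9, 0.5, 0.3]
small_regime = prune_by_difficulty(scores, 0.5, small_data_threshold=10)
large_regime = prune_by_difficulty(scores, 0.5, small_data_threshold=2)
```

In this toy run, `small_regime` holds the indices of the two lowest-scoring examples and `large_regime` the two highest-scoring ones. How difficulty is actually measured, and where the regime boundary sits, would have to come from the speaker's own work rather than from this sketch.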