The Future of Data Pruning for Large Language Models
It aims to see if we can achieve faster-than-power-law scaling of the test loss with dataset size. When do you think you'll see whether or not that's achievable? The future of training these models will be the quality of the training data over the quantity of the training data, and I also think it will be very interesting from an interpretability standpoint to understand which kinds of training examples these models care about.
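For context, here is a minimal sketch (an illustration, not from the episode) of what "faster than power-law scaling" means: under ordinary random sampling, test loss is often observed to fall as a power law in dataset size D, roughly L(D) ≈ a·D^(−ν), while the hope with data pruning is that keeping only the most informative examples yields a faster decay, for instance closer to exponential. The functional forms and constants below are assumptions chosen purely for illustration.

```python
import numpy as np

def power_law_loss(D, a=1.0, nu=0.1):
    """Baseline scaling: test loss decays as a power law in dataset size,
    L(D) = a * D**(-nu). Constants are illustrative, not measured."""
    return a * D ** (-nu)

def pruned_loss(D, a=1.0, b=5e-7):
    """Assumed faster-than-power-law decay for a carefully pruned dataset
    (an exponential form, hypothetical, not a result from the episode)."""
    return a * np.exp(-b * D)

# Compare the two regimes at a few dataset sizes: the pruned curve starts
# worse but eventually falls far below the power-law baseline.
for D in [10**5, 10**6, 10**7]:
    print(f"D={D:>9,}: power law {power_law_loss(D):.3f}, pruned {pruned_loss(D):.4f}")
```

The crossover behavior in the printout is the point of interest: beating the power law only pays off once the dataset is large enough that most randomly sampled examples are redundant, which is also where the interpretability question of which examples the model actually cares about becomes concrete.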