Does It Make Any Sense to Train on Generated Data?
A key limitation of current systems is that compute scales with the length of the prompt and response: there is no user-facing or architectural knob to adjust compute usage based on the difficulty of a problem. Training on generated data can amortize compute across future queries, but it doesn't account for the energy spent producing that data in the first place. A more efficient approach would let the model decide compute usage dynamically at inference time, depending on the complexity of the problem. This would avoid wasting compute on simple problems while allocating more resources to challenging ones.
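The idea of difficulty-dependent compute can be sketched as an adaptive inference loop: keep spending units of compute (e.g. additional reasoning passes) until a confidence estimate clears a threshold or a budget is exhausted. Everything below is a hypothetical toy, not any real model API — `solve_step`, its difficulty proxy, and the confidence formula are all assumptions for illustration.

```python
# Hypothetical sketch: allocate inference compute per problem difficulty.
# `solve_step` stands in for one unit of model compute (one more reasoning
# pass); it returns a candidate answer plus a confidence estimate.
def solve_step(problem: str, steps_so_far: int) -> tuple[str, float]:
    # Toy stand-in: confidence grows with steps; harder problems grow slower.
    difficulty = len(problem) / 20.0  # crude proxy for problem difficulty
    confidence = min(1.0, (steps_so_far + 1) / (1.0 + difficulty * 5))
    return f"answer after {steps_so_far + 1} steps", confidence

def adaptive_infer(problem: str, threshold: float = 0.9, max_steps: int = 32):
    """Spend just enough compute: stop once confidence clears the bar."""
    for step in range(max_steps):
        answer, confidence = solve_step(problem, step)
        if confidence >= threshold:
            return answer, step + 1  # easy problems exit early
    return answer, max_steps  # hard problems use the full budget

easy_answer, easy_steps = adaptive_infer("2+2?")
hard_answer, hard_steps = adaptive_infer("prove this conjecture rigorously")
print(easy_steps, hard_steps)  # the harder prompt consumes more steps
```

The design choice here is the halting criterion: a fixed confidence threshold is the simplest option, but the same loop structure accommodates learned halting (as in adaptive computation time) or an explicit user-set compute budget.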