How to Train Models That Are Better on Blue Benchmark?
In a recent talk, you were talking about how, once these benchmarks are out, you have hundreds of people working on them to push the numbers up. And you said it was similar to p-hacking, right? If you run a hundred different experiments, yeah, you're going to see that something is correlated with something else, but that doesn't necessarily mean there is a meaningful relationship between the two.

Yeah. So now a lot of papers and model reports say, yeah, we just train our pretrained models on short sequences. But sadly your application is, I don't know, human classification, which is