AI-powered
podcast player
Listen to all your favourite podcasts with AI-powered features
The Unfairness of Data in Machine Learning
I think that the people working in machine learning didn't feel empowered to get their own data sets. And so they would just use whatever data sets got invented. So I felt like they put way more like process around the labeling than was really necessary. They're like, we should get like a smaller amount of high quality data. But if you're trying to make like an application, you know, you might be willing to sacrifice some of the annotation pieces to collect data or high volume for sure.