
635: The Perils of Manually Labeling Data for Machine Learning Models
Super Data Science: ML & AI Podcast with Jon Krohn
The Arguments Against Hand Labeling
One of the fundamental issues with hand labeling is that it is a nondeterministic process. The second is inconsistent results when you resample fresh data and retrain. It's also very slow to deal with drift, especially in adversarial problems. Then you get into the issues around the cost of hand labeling. There's capital. If I want to build a model that's not just "hot dog, not hot dog," if I'm trying to do something more interesting, like "cancer, not cancer," I can't just pull a random person off the street.