
Apache Spark (Pt. 2): MLlib - ML 074
Adventures in Machine Learning
Is Downsampling a Good Thing?
Downsampling your data so that you can iterate quickly and work with more usable libraries. I've noticed that whenever I'm working with very large data sets, even if it takes 10 or 20 minutes to train, that's fine. That's the best you can do. But if you can decrease training time and the time it takes to get results, I would argue that's one of the most valuable things you can do as a modeler.
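
To make the idea concrete, here is a minimal sketch of seeded random downsampling in plain Python. It is not from the episode: the `downsample` helper, the 1% fraction, and the toy data are all assumptions chosen for illustration. In PySpark, the comparable step would likely be `DataFrame.sample` with a `fraction` and a `seed`.

```python
import random


def downsample(rows, fraction, seed=42):
    """Keep each row independently with probability `fraction`.

    Hypothetical helper for illustration. A fixed seed makes the
    sample reproducible, so repeated experiments see the same rows.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < fraction]


# Toy stand-in for a large data set (assumed, not real data).
full = list(range(100_000))

# Iterate on roughly 1% of the rows, then retrain on `full` once
# the modeling approach has settled.
small = downsample(full, fraction=0.01)
```

Because each row is kept independently, the sample size is only approximately `fraction * len(rows)`. If class balance matters, sampling within each label group separately would be a reasonable variation.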