
Apache Spark (Pt. 2): MLlib - ML 074
Adventures in Machine Learning
How Do You Train and Optimize on Downsampled Data?
Downsampling is really effective for a variety of reasons. But what percent of the time do you train and optimize on downsampled data and then, when you ramp it up to the big, full data set, see trends reverse or coefficients change because new features were discovered that weren't fit during the training process? More frequently than I would probably like to admit.
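To make the failure mode concrete, here is a minimal sketch in plain Python (not Spark or MLlib). Everything in it is an assumption made for illustration: the synthetic data, the `rare` flag, the effect size of -500, and the every-100th-row sample are all hypothetical and deliberately contrived so that the sample misses the rare segment entirely.

```python
def ols_slope(xs, ys):
    """Slope of a one-variable least-squares fit of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def make_rows(n=10_000):
    """Synthetic rows (x, rare, y); hypothetical data, not from the episode."""
    rows = []
    for i in range(n):
        block, offset = divmod(i, 100)
        x = block / 10.0  # 0.0 .. 9.9, constant within each block of 100 rows
        # A rare segment (1% of rows) that only occurs at high x values.
        rare = 1 if (block >= 90 and offset >= 90) else 0
        y = 2.0 * x - 500.0 * rare  # the rare segment has a large negative effect
        rows.append((x, rare, y))
    return rows


rows = make_rows()

# A 1% down sample: every 100th row. By construction it contains no rare rows.
sample = rows[::100]

sample_slope = ols_slope([r[0] for r in sample], [r[2] for r in sample])
full_slope = ols_slope([r[0] for r in rows], [r[2] for r in rows])

print("rare rows in sample:", sum(r[1] for r in sample))
print("slope on down sample:", sample_slope)
print("slope on full data:  ", full_slope)
```

On the down sample the fitted trend is positive, because the rare segment never shows up. On the full data the same one-variable fit turns negative, since the unmodeled rare rows sit at high `x` and pull the line down. A real random sample would miss a rare segment only some of the time, which is part of why this kind of surprise tends to appear when scaling up.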