
Apache Spark (Pt. 2): MLlib - ML 074
Adventures in Machine Learning
Spark ML - Spark Model Generation and Validation
You can build most of it with the Pipeline API in MLlib, or Spark ML. If you need something more advanced, like interactions that go beyond second-order pairwise terms, you're probably going to want to use something from sklearn. But you can't call an sklearn algorithm on a distributed Spark DataFrame. For the other parts of production, though, like model training or model selection, all of that can be done in Spark.
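As a rough illustration of the sklearn route mentioned here, the sketch below builds interaction terms up to third order with scikit-learn's `PolynomialFeatures`. It assumes the data is small enough to have been collected from Spark onto a single machine first (for example with `toPandas()`); the NumPy array `local_features` is a hypothetical stand-in for that collected data, not something from the episode.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical stand-in for rows already collected from a Spark DataFrame
# onto the driver; sklearn only operates on local, in-memory arrays.
local_features = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 0.0, 1.0],
])

# interaction_only=True keeps products of distinct columns and drops powers
# such as x0**2; degree=3 adds the three-way term x0*x1*x2.
interactions = PolynomialFeatures(degree=3, interaction_only=True, include_bias=False)
expanded = interactions.fit_transform(local_features)

# Column order: x0, x1, x2, x0*x1, x0*x2, x1*x2, x0*x1*x2
print(expanded.shape)
print(expanded[0])
```

Three input columns expand to seven here: the originals, three pairwise products, and one three-way product. The catch the speaker points at still applies: this step runs on one machine, so it only works once the data fits in local memory.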