
Apache Spark (Pt. 2): MLlib - ML 074

Adventures in Machine Learning


Spark ML - Spark Model Generation and Validation

You can build most of it with the pipeline API in MLlib, or Spark ML. If you need something more advanced, like feature interactions that go beyond second order, you're probably going to want to use stuff from sklearn. You can't call an sklearn algorithm on a distributed Spark DataFrame. But for the other parts of production, like model training or model selection, all of that can be done.
