
Apache Spark (Pt. 2): MLlib - ML 074
Adventures in Machine Learning
Is PySpark Going Down the Path of Better Reporting for Models?
In my experience, the gold standard is R. They have such easy, usable model outputs. The answer there is that MLlib isn't going to natively have more stuff put into it to support this aspect. So from the Databricks side, we don't do that in SparkML. We're trying to minimize the amount of code that gets added to Spark. Instead, we put that functionality in MLflow. And as of this last major release of MLflow, 1.25.1, we have model explainability where you can take a PySpark model from MLlib. It'll detect the model type. It'll generate all of your charts for