Polyglot data science is the idea that, in order to get things done, you might need to use multiple tools, multiple languages really. These are situations where you want to leverage the command line and do parts of your computation there. Apache Spark has a pipe method where you can pass an entire data set, an RDD, through a command-line tool. Maybe it was just a fun little hack by the authors; I don't know.
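To make the pipe idea concrete, here is a minimal plain-Python sketch of what passing a data set through a command-line tool looks like: each record is written as one line to the tool's standard input, and each line of its standard output becomes a record of the result. The helper name `pipe_through` is hypothetical and only for illustration, and the example assumes a POSIX system where `tr` is available. In PySpark the corresponding call would be something like `rdd.pipe("tr a-z A-Z")`, which does this per partition rather than once for the whole data set.

```python
import shlex
import subprocess


def pipe_through(records, command):
    """Send each record as one line to `command`'s stdin; return its stdout lines.

    Hypothetical helper that mimics the line-in, line-out contract of
    Spark's RDD pipe method on a plain Python list.
    """
    # One record per line, the same framing a shell pipeline expects.
    payload = "".join(f"{record}\n" for record in records)
    result = subprocess.run(
        shlex.split(command),   # e.g. "tr a-z A-Z" -> ["tr", "a-z", "A-Z"]
        input=payload,
        capture_output=True,
        text=True,
        check=True,             # raise if the external tool exits non-zero
    )
    # Each output line becomes one record of the new data set.
    return result.stdout.splitlines()


if __name__ == "__main__":
    # Assumes a POSIX environment with `tr` on the PATH.
    print(pipe_through(["spark", "pipe", "rdd"], "tr a-z A-Z"))
```

The point of the sketch is that the external tool only needs to read lines and write lines; it does not have to know anything about Spark or Python, which is what makes mixing languages this way practical.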