How to Test Your Data Analysis Pipeline
Most data analysis starts in the context of at least one concrete data set. You have to try, as much as possible, to script your process. Most of the time, the outputs that we produce from complex data analysis pipelines are not identical from one run to the next. They don't produce exactly the same files or exactly the same screen output each time, for trivial reasons: they include a date stamp, or they include version numbers of various components of the software.
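One way to test a scripted pipeline despite those trivial differences is to mask the volatile parts of the output before comparing it with a saved reference. The sketch below is only an illustration, not a method described in the episode. The function names `normalize` and `outputs_match` are hypothetical, and the two patterns assume ISO-style date stamps (`2024-02-11`) and three-part version numbers (`1.5.3`), so they would need adapting to whatever your own pipeline prints.

```python
import re

# Assumed formats (hypothetical): ISO dates with an optional time,
# and three-part version numbers such as 1.5.3 or v2.1.0.
DATE_STAMP = re.compile(r"\d{4}-\d{2}-\d{2}(?:[ T]\d{2}:\d{2}(?::\d{2})?)?")
VERSION = re.compile(r"\bv?\d+\.\d+\.\d+\b")


def normalize(text: str) -> str:
    """Replace date stamps and version numbers with fixed placeholders."""
    text = DATE_STAMP.sub("<DATE>", text)
    text = VERSION.sub("<VERSION>", text)
    return text


def outputs_match(expected: str, actual: str) -> bool:
    """Compare two outputs while ignoring date stamps and version numbers."""
    return normalize(expected) == normalize(actual)


# Two runs that differ only in the date and the library version.
reference = "Run on 2023-01-05 with pandas 1.5.3\nmean = 4.20\n"
latest = "Run on 2024-02-11 with pandas 2.1.0\nmean = 4.20\n"
same = outputs_match(reference, latest)
```

The version pattern requires three numeric parts so that ordinary results such as `4.20` are left untouched and a real change in the numbers still shows up as a mismatch.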