Most data analysis starts in the context of at least one concrete data set. You have to try, as much as possible, to script your process. Most of the time the outputs that we produce from complex data analysis pipelines are not identical. They don't produce exactly the same files or exactly the same screen output each time, for trivial reasons: they include a date stamp, or they include version numbers of various components of the software.
Nick Radcliffe, data scientist and entrepreneur, talks to us about the importance of testing your data. As software engineers we are familiar with test-driven development. Test-driven data analysis puts the same emphasis on validating and testing the data for your AI app. We also dive into the Python library of the same name, tdda.
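To make the point about date stamps and version numbers concrete, here is a minimal sketch of comparing a pipeline's output against a stored reference while ignoring those trivial differences. It is not the tdda API, only a standard-library illustration of the idea; the patterns, function names and sample report text are all hypothetical.

```python
import re

# Hypothetical patterns for the parts of an output that change on every run.
DATE_STAMP = re.compile(r"\d{4}-\d{2}-\d{2}")      # e.g. 2024-03-02
VERSION = re.compile(r"\bv?\d+\.\d+\.\d+\b")       # e.g. 1.5.0 or v1.5.0


def normalise(text: str) -> str:
    """Replace volatile substrings with fixed placeholders."""
    text = DATE_STAMP.sub("<DATE>", text)
    return VERSION.sub("<VERSION>", text)


def outputs_match(actual: str, reference: str) -> bool:
    """Compare two outputs after masking date stamps and version numbers."""
    return normalise(actual) == normalise(reference)


# Made-up sample outputs, for illustration only.
reference = "Report generated 2023-01-15 by pipeline 1.4.2\nrows: 1000\n"
same_result = "Report generated 2024-03-02 by pipeline 1.5.0\nrows: 1000\n"
changed_result = "Report generated 2024-03-02 by pipeline 1.5.0\nrows: 998\n"

print(outputs_match(same_result, reference))     # only date and version differ
print(outputs_match(changed_result, reference))  # the row count differs too
```

Masking before comparing keeps the check strict about the numbers that matter while tolerating the parts that are expected to change from run to run.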
Links:
Other libraries mentioned:
Don't forget the upcoming RSE conferences and the Hidden REF event.
Thank you for listening! Merci de votre écoute! Vielen Dank fürs Zuhören!
Contact Details/ Coordonnées / Kontakt:
This podcast is licensed under the Creative Commons Licence: https://creativecommons.org/licenses/by-sa/4.0/