The TDDA library can be used to help people deduplicate data. It generates a dozen different kinds of constraints, but they're only limited by the duplicate stuff like that. You can always define a new field that encompasses some extra stuff that you want. If you have a set of columns that ought to add up to a total, TDDA won't tell you that the total is not correct.
Send us a text
Nick Radcliffe, data scientist and entrepreneur, talks to us about the importance of test your data. As software engineers we are familiar with test driven development. Test driven data analysis puts the same emphasis on validating and testing data for your AI app. We also dive into the Python library of the same name tdda.
Links:
Other libraries mentioned:
Don't forget the upcoming RSE conferences and the Hidden Ref event
Support the show
Thank you for listening! Merci de votre écoute! Vielen Dank für´s Zuhören!
Contact Details/ Coordonnées / Kontakt:
This podcast is licensed under the Creative Commons Licence: https://creativecommons.org/licenses/by-sa/4.0/