
Lee Baker, Author of Getting Started With Statistics: A Series of Bitesize Guides For Beginners
Frontmatter
The Importance of Clean Data in Data Analysis
When we're talking about data, it's hugely messy. The biggest elephant in the room is dirty data. You've got to have perfectly clean data. And anybody that's done any data analysis or statistics knows this. There was somebody I was talking to a few years ago now who said their institute had to submit their data set to a national database for use by the public. But before they could do it, they had problems with the data. It wasn't perfectly clean. So they went to a company to have it cleaned before uploading it to this national database. They got their quotation. $72 million to clean this database. Oh my God, $72 million. That's really interesting