Lee Baker, Author of Getting Started With Statistics: A Series of Bitesize Guides For Beginners

Frontmatter

The Importance of Clean Data in Data Analysis

When we're talking about data, it's hugely messy. The biggest elephant in the room is dirty data. You've got to have perfectly clean data. And anybody who's done any data analysis or statistics knows this. Somebody I was talking to a few years ago said their institute had to submit its data set to a national database for use by the public. But before they could do it, they had problems with the data. It wasn't perfectly clean. So they went to a company to have it cleaned before uploading it to this national database. They got their quotation: $72 million to clean this database. Oh my God, $72 million. That's really interesting
