
Becoming More Effective at Manipulating Data With Pandas

The Real Python Podcast


Cleaning Data for Machine Learning

Cleaning data is a big part of the book. A lot of people are working from CSV files, comma-separated value files, which are great in that they're human readable, but that's about the extent of the greatness. If you have a column that has some entry that's not numeric, then pandas will load it, but it will not convert it to a number. And so stepping through those, you know, making sure that they are numeric. Another problem is missing data, you know. And data can be missing for various reasons, so that can be a process in and of itself, just to figure out why it's missing and then figure out the correct way of either
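To make the two problems mentioned here concrete, the following is a minimal sketch, not something taken from the book or the episode. The CSV content, the column names, and the "unknown" placeholder are all made up for illustration. It shows a column staying non-numeric because of one bad entry, one possible way to coerce it with pd.to_numeric, and a count of the missing values that result.

```python
import io

import pandas as pd

# Hypothetical CSV content: the names, columns, and the "unknown" entry
# are assumptions made up for this illustration.
raw = io.StringIO(
    "name,age,income\n"
    "alice,34,52000\n"
    "bob,unknown,48000\n"
    "carol,29,\n"
)

df = pd.read_csv(raw)

# One non-numeric entry is enough to keep the whole "age" column
# from being loaded as a numeric dtype.
age_is_numeric_before = pd.api.types.is_numeric_dtype(df["age"])

# Coerce the column to numbers; entries that cannot be parsed become NaN.
df["age"] = pd.to_numeric(df["age"], errors="coerce")
age_is_numeric_after = pd.api.types.is_numeric_dtype(df["age"])

# Count missing values per column as a first step toward asking
# why each one is missing.
missing_counts = df.isna().sum()

print(age_is_numeric_before, age_is_numeric_after)
print(missing_counts)
```

Coercing with errors="coerce" is only one option, and it turns bad entries into missing values silently, so it is worth checking the counts afterward. Whether to drop or fill those rows depends on why the data is missing.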

