
Exploratory Data Analysis (EDA) in Machine Learning - ML 075
Adventures in Machine Learning
How to Capture Nonlinear Relationships in ETL Data
For the vast majority of problems out there in the data science world, you're not going to have a full picture of everything that affects the problem state. If you don't know how some of these calculated columns were derived, you could have label leakage, or you could have compounding calculations. It becomes challenging to get a model that won't overfit like crazy to a very limited subset of features that you don't actually want it to learn from.
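As a rough illustration of the label leakage concern, here is a minimal sketch of one check you might run during EDA: flag any column that tracks the label almost perfectly. This is not a method described in the episode. The column names, the toy data, and the 0.95 threshold are all made up for the example, and a high correlation is only a prompt to go find out how the column was derived, not proof of leakage. Pearson correlation also only picks up linear relationships, so a clean result does not rule leakage out.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def flag_suspect_columns(columns, label, threshold=0.95):
    """Return the names of columns whose |correlation| with the label meets the threshold."""
    return sorted(
        name
        for name, values in columns.items()
        if abs(pearson(values, label)) >= threshold
    )


# Toy data, invented for illustration. `refund_issued` stands in for a
# hypothetical calculated column that was derived downstream of the label.
label = [0, 1, 0, 1, 1, 0, 1, 0]
columns = {
    "tenure_months": [3, 10, 24, 7, 15, 30, 2, 12],
    "refund_issued": [0, 1, 0, 1, 1, 0, 1, 0],
}

suspects = flag_suspect_columns(columns, label)
print(suspects)
```

Any column this flags is worth tracing back through the ETL pipeline before it goes anywhere near a model.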