771: Gradient Boosting: XGBoost, LightGBM and CatBoost, with Kirill Eremenko
Apr 2, 2024
Machine learning expert Kirill Eremenko discusses decision trees, random forests, and the top gradient boosting algorithms: XGBoost, LightGBM, and CatBoost. Topics include advantages of XGBoost, LightGBM efficiency, and CatBoost's handling of categorical variables.
Gradient boosting iteratively improves predictions by focusing on residuals and gradients.
XGBoost prioritizes speed in gradient boosting with efficient enhancements.
LightGBM employs histogram-based splits and exclusive feature bundling for efficiency.
CatBoost excels in managing categorical variables with special target encoding methods.
Deep dives
Transition to Gradient Boosting from Previous Methods
Unlike random forest, gradient boosting improves predictions iteratively: each new model predicts the errors of the models before it, and the models are chained together. AdaBoost, a precursor, adapts by giving extra weight to the training examples that earlier models got wrong. In gradient boosting, each model in the ensemble instead predicts the gradient of the loss function, optimizing directionally based on residuals.
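As a rough illustration of the two approaches (this is a minimal scikit-learn sketch, not code from the episode or the course), AdaBoost and gradient boosting can be compared on the same regression task:

```python
# Minimal comparison of AdaBoost vs. gradient boosting using scikit-learn's
# off-the-shelf estimators; dataset and hyperparameters are illustrative only.
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor, GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# AdaBoost reweights training examples that earlier models predicted poorly.
ada = AdaBoostRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

# Gradient boosting instead fits each new tree to the gradient of the loss.
gbr = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1,
                                random_state=0).fit(X_train, y_train)

print("AdaBoost R^2:          ", ada.score(X_test, y_test))
print("Gradient boosting R^2: ", gbr.score(X_test, y_test))
```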
Adaptive Nature in Gradient Boosting Process
Gradient boosting begins by defining a loss function and then calculates the gradients of that function after each model is built. The initial model simply predicts the average of the target; each subsequent model predicts the gradients, iteratively refining the ensemble's predictions and reducing error at each step.
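The loop below is a from-scratch sketch of this process for regression with squared error loss, where the negative gradient is simply the residual. The function names and hyperparameters are my own choices for illustration, not material from the course:

```python
# From-scratch gradient boosting for regression with squared error loss.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gradient_boosting(X, y, n_rounds=100, learning_rate=0.1, max_depth=3):
    # Step 1: the initial "model" is just the average of the targets.
    f0 = float(np.mean(y))
    prediction = np.full(len(y), f0)
    trees = []
    for _ in range(n_rounds):
        # Step 2: for 0.5*(y - F)^2, the negative gradient w.r.t. F is the residual.
        residuals = y - prediction
        # Step 3: fit a small tree to predict those residuals (the gradients).
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residuals)
        # Step 4: nudge the running prediction in the direction of the gradient.
        prediction += learning_rate * tree.predict(X)
        trees.append(tree)
    return f0, trees

def predict_gradient_boosting(f0, trees, X, learning_rate=0.1):
    pred = np.full(X.shape[0], f0)
    for tree in trees:
        pred += learning_rate * tree.predict(X)
    return pred
```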
In-depth Understanding of Gradients in Gradient Boosting
The core concept in gradient boosting is understanding and predicting gradients of the loss function. Models are iteratively constructed to improve predictions by targeting these gradients, allowing for systematic error reduction and refined model predictions.
Mathematical Insights into Gradient Boosting Operations
Gradient boosting operates by defining loss functions and using gradients to guide model improvement at each iteration. By predicting these gradients for each data point, subsequent models focus on reducing errors and optimizing predictions based on the direction and magnitude of these gradients.
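In symbols (standard textbook notation, not taken verbatim from the episode), each round computes a pseudo-residual per data point and adds a new tree fit to those values:

```latex
r_{im} = -\left[\frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)}\right]_{F = F_{m-1}}
\qquad \text{(pseudo-residual for point } i \text{ at round } m\text{)}

F_m(x) = F_{m-1}(x) + \nu\, h_m(x),
\qquad \text{where } h_m \text{ is a tree fit to the } r_{im}
\text{ and } \nu \text{ is the learning rate.}
```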
Main Concept of Gradient Boosting for Regression
Gradient boosting is about minimizing a loss function by predicting its gradients, i.e. the derivatives of the loss with respect to the current predictions. For regression with squared error, those gradients are simply the residuals, so each round predicts residuals to iteratively improve the model. For classification problems, the algorithm minimizes a binomial deviance loss, and the model works in log-odds space rather than predicting probabilities directly.
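For the binary case, the shift to log odds can be summarized as follows (standard derivation in my own notation): the raw score F(x) is a log-odds value, the sigmoid converts it to a probability, and the negative gradient of the binomial deviance reduces to the familiar residual between label and predicted probability:

```latex
p_i = \sigma\bigl(F(x_i)\bigr) = \frac{1}{1 + e^{-F(x_i)}},
\qquad
L\bigl(y_i, F(x_i)\bigr) = -\bigl[y_i \log p_i + (1 - y_i)\log(1 - p_i)\bigr]

-\frac{\partial L}{\partial F(x_i)} = y_i - p_i,
\qquad \text{so each new tree is fit to the residuals } y_i - p_i.
```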
Evolution of Gradient Boosting Algorithms
The episode delves into the algorithms that enhance traditional gradient boosting: XGBoost, LightGBM, and CatBoost. XGBoost, introduced in 2014, prioritized speeding up the traditionally slow gradient boosting method. LightGBM, more efficient still than XGBoost, employs histogram-based splits and exclusive feature bundling. CatBoost, introduced by Yandex in 2017, specializes in categorical features and uses symmetric tree growth for enhanced speed.
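All three libraries expose scikit-learn-style wrappers, so a side-by-side comparison is straightforward. The sketch below assumes the xgboost, lightgbm, and catboost packages are installed; the dataset and hyperparameter values are illustrative assumptions, not recommendations from the episode:

```python
# Side-by-side use of the three gradient boosting libraries' sklearn-style APIs.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor
from catboost import CatBoostRegressor

X, y = make_regression(n_samples=2000, n_features=20, noise=5.0, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

models = {
    "XGBoost":  XGBRegressor(n_estimators=500, learning_rate=0.05),
    "LightGBM": LGBMRegressor(n_estimators=500, learning_rate=0.05),
    "CatBoost": CatBoostRegressor(n_estimators=500, learning_rate=0.05, verbose=0),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name} R^2: {model.score(X_test, y_test):.3f}")
```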
Symmetric Tree Growth in Gradient Boosting
The discussion highlights how symmetric (oblivious) trees in CatBoost offer a faster inference process. Every node at a given depth uses the same split condition, so evaluating the tree reduces to a fixed sequence of comparisons rather than a path-dependent tree walk. This consistency in decision-making throughout the tree results in quicker computation and streamlined model predictions.
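Symmetric growth is CatBoost's default behavior; recent versions also expose it explicitly through a grow_policy parameter (a library detail I am recalling here, not something stated in the episode):

```python
# CatBoost grows symmetric (oblivious) trees by default; grow_policy makes
# the choice explicit and allows level-wise growth for comparison.
from catboost import CatBoostRegressor

symmetric = CatBoostRegressor(grow_policy="SymmetricTree", depth=6, verbose=0)
levelwise = CatBoostRegressor(grow_policy="Depthwise", depth=6, verbose=0)

# With SymmetricTree, every node at a given depth shares the same split,
# which is what makes inference a fixed sequence of comparisons.
```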
Benefits of Categorical Feature Handling in CatBoost
CatBoost excels in managing categorical variables by using special target encoding methods and avoiding the common pitfalls of traditional one-hot encoding. The algorithm employs ordered target encoding to prevent data leakage and improve decision-making with categorical features. Handling categorical data natively contributes to improved model performance and processing efficiency.
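In practice this means categorical columns can be passed to CatBoost directly via the cat_features argument rather than one-hot encoded up front. The toy DataFrame below is purely illustrative (column names and values are my own, not from the episode):

```python
# Passing categorical columns straight to CatBoost via cat_features.
import pandas as pd
from catboost import CatBoostClassifier

df = pd.DataFrame({
    "city":    ["London", "Paris", "London", "Berlin"],
    "device":  ["mobile", "desktop", "mobile", "mobile"],
    "spend":   [12.0, 80.5, 7.2, 33.1],
    "churned": [0, 1, 0, 1],
})

X = df[["city", "device", "spend"]]
y = df["churned"]

# cat_features tells CatBoost which columns to handle with its ordered
# target encoding instead of requiring one-hot encoding beforehand.
model = CatBoostClassifier(iterations=50, verbose=0)
model.fit(X, y, cat_features=["city", "device"])
```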
Kirill Eremenko joins Jon Krohn for another exclusive, in-depth teaser for a new course just released on the SuperDataScience platform, “Machine Learning Level 2”. Kirill walks listeners through why decision trees and random forests are fruitful for businesses, and he offers hands-on walkthroughs for the three leading gradient-boosting algorithms today: XGBoost, LightGBM, and CatBoost.
This episode is brought to you by Ready Tensor, where innovation meets reproducibility, and by Data Universe, the out-of-this-world data conference. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
In this episode you will learn:
• All about decision trees [09:17]
• All about ensemble models [21:43]
• All about AdaBoost [36:47]
• All about gradient boosting [45:52]
• Gradient boosting for classification problems [59:54]
• Advantages of XGBoost [1:03:51]
• LightGBM [1:17:06]
• CatBoost [1:32:07]