In this episode, Dean speaks with Federico Bacci, a data scientist and ML engineer at Bol, the largest e-commerce company in the Netherlands and Belgium. Federico shares what he has learned about deploying machine learning models in production, particularly for forecasting problems. He discusses the challenges of model explainability, the importance of feature engineering over model complexity, and the critical role of stakeholder feedback in improving ML systems. Federico also explains why LLMs aren't always the answer in AI applications and makes the case for tailored solutions. The conversation offers practical knowledge for data scientists and ML engineers who want a better understanding of real-world ML operations and challenges in e-commerce.
Join our Discord community: https://discord.gg/tEYvqxwhah
---
Timestamps:
00:00 Introduction and Background
01:59 Owning the ML Pipeline
02:56 Deployment Process
05:58 Testing and Feedback
07:40 Different Deployment Strategies
11:19 Explainability and Feature Importance
13:46 Challenges in Forecasting
22:33 ML Stack and Tools
26:47 Orchestrating Data Pipelines with Airflow
31:27 Exciting Developments in ML
35:58 Recommendations and Closing
Links:
Dwarkesh podcast with Anthropic and Gemini team members – https://www.dwarkeshpatel.com/p/sholto-douglas-trenton-bricken
➡️ Federico Bacci on LinkedIn – https://www.linkedin.com/in/federico-bacci/
➡️ Federico Bacci on Twitter – https://x.com/fedebyes
🌐 Check Out Our Website! https://dagshub.com
Social Links:
➡️ LinkedIn: https://www.linkedin.com/company/dagshub
➡️ Twitter: https://x.com/TheRealDAGsHub
➡️ Dean Pleban: https://x.com/DeanPlbn