

Data Debt in Machine Learning with D. Sculley - #574
May 19, 2022
D. Sculley, a director on the Google Brain team known for his insights on technical debt in machine learning, dives into the evolving concept of data debt. He discusses the integral role data quality plays in data-centric AI and highlights common sources of data debt. The conversation touches on innovative strategies like causal inference graphs and stress testing for improving model robustness. Sculley also explores the community's proactive steps to mitigate these issues, emphasizing a shift towards more accountable data practices.
AI Snips
Bingo Game for Technical Debt Diagram
- D. Sculley's paper, "Hidden Technical Debt in Machine Learning Systems," became so popular that a bingo game was created around its diagram.
- This happened at the first TWIMLcon, a conference about MLOps and AI platforms.
From Art to AI
- D. Sculley's path to machine learning was non-traditional, starting as an art major and then a teacher.
- He later pursued a PhD in machine learning, driven by his interest in learnability.
Data-Centric AI Focus
- Creating machine learning models has become easier due to improved infrastructure and tools.
- The main challenge now lies in gathering and curating training data and ensuring that it is appropriate.