
Data Debt in Machine Learning with D. Sculley - #574
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Addressing Data Debt in Machine Learning
This chapter examines data debt in machine learning, emphasizing the dangers of unrepresentative datasets and bias. It introduces concepts such as "datasheets for datasets" and causal inference graphs, and advocates structured approaches to maintaining data quality and transparency. The discussion also covers stress testing of large language models, highlighting methods that use counterfactual data to identify biases and improve robustness.