

912: In Case You Missed It in July 2025
Aug 8, 2025
Explore the importance of data-centric machine learning in legal tech and the challenges of noisy data. Delve into low-resource languages and the impactful DMLR initiative. Discover the shift from traditional to data-centric methods that emphasize dataset quality, particularly in finance. Uncover how neuroscience informs AI predictions about human behavior and improves business decisions. Finally, dive into causal AI's potential for predicting user actions in gaming, with practical tools like PyTorch.
AI Snips
Episode 912 Highlight Reel
- Episode 912 curates standout interviews from July to help listeners catch up quickly.
- Jon Krohn frames the episode as a short highlights reel for busy data practitioners.
Noisy Legal Labels Sparked A Shift
- Lilith Bat-Leah found legal labels so noisy that she couldn't trust model evaluations.
- That motivated her shift from algorithm tuning to data-centric machine learning research.
Join The MLCommons DataPerf Group
- Lilith Bat-Leah recommends joining the MLCommons DataPerf working group to help build public datasets for low-resource languages.
- Contribute if you work with data and want to improve ML training corpora.