Super Data Science: ML & AI Podcast with Jon Krohn

912: In Case You Missed It in July 2025

Aug 8, 2025
Explore the importance of data-centric machine learning in legal tech, tackling noisy data challenges. Delve into low-resource languages and the impactful DMLR initiative. Discover the shift from traditional to data-centric methods emphasizing dataset quality, particularly in finance. Uncover how neuroscience informs AI predictions about human behavior, enhancing business decisions. Finally, dive into causal AI's potential for predicting user actions in gaming, highlighting practical tools like PyTorch.
INSIGHT

Episode 912 Highlight Reel

  • Episode 912 curates standout interviews from July to help listeners catch up quickly.
  • Jon Krohn frames the episode as a short highlights reel for busy data practitioners.
ANECDOTE

Noisy Legal Labels Sparked A Shift

  • Lilith Bat-Leah found legal labels so noisy that she couldn't trust model evaluations.
  • That motivated her shift from algorithm tuning to data-centric machine learning research.
ADVICE

Join The MLCommons DataPerf Group

  • Lilith Bat-Leah recommends joining the MLCommons DataPerf working group to help build public low-resource language datasets.
  • Contribute if you work with data and want to improve ML training corpora.