The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Can We Trust Scientific Discoveries Made Using Machine Learning? with Genevera Allen - TWiML Talk #266

May 16, 2019

Genevera Allen, an associate professor of statistics at Rice University, discusses whether scientific discoveries made using machine learning can be trusted. She describes the challenges of reproducibility in scientific research, especially in biomedical fields. She also reflects on her talk at the AAAS conference, the audience reactions to it, and future research directions. The conversation emphasizes the importance of statistical methods in validating results and the need for better education and terminology when machine learning is applied to scientific research.

ANECDOTE

Cancer Subtype Reproducibility

In breast cancer, subtypes were successfully found using clustering, leading to targeted drug development.
However, similar efforts in other cancers haven't been consistently reproducible, raising concerns.

INSIGHT

Discovery vs. Prediction

Reproducibility in data-driven discovery differs from reproducibility in prediction: the goal is an insight that generalizes, not just accurate outputs.
A key question is how to assess the generalizability of discoveries such as clusters or feature importance.

ADVICE

Assessing Generalizability

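Split data into training and test sets, and check whether discoveries made on the training set also hold on the test set.
Use the stability principle: repeatedly perturb the training data at random (for example, by subsampling), rerun the analysis each time, and aggregate the results to identify stable, reproducible discoveries.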
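
As a rough illustration of the stability idea above, here is a minimal Python sketch. It is not taken from the episode: the "discovery" step (picking the features most correlated with an outcome), the toy data, and all function names and parameters (`top_k_features`, `selection_frequency`, `frac`, `n_rounds`) are assumptions chosen for the example. The discovery step is rerun on many random subsamples, and the fraction of runs in which each feature is selected serves as a simple stability score.

```python
import random
from collections import Counter


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def top_k_features(X, y, k):
    """Hypothetical 'discovery' step: the k features most correlated with y."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def selection_frequency(X, y, k, n_rounds=100, frac=0.5, seed=0):
    """Rerun the discovery step on random subsamples of the rows and
    return, per feature, the fraction of rounds in which it was selected."""
    rng = random.Random(seed)
    n = len(X)
    m = max(2, int(frac * n))  # subsample size
    counts = Counter()
    for _ in range(n_rounds):
        idx = rng.sample(range(n), m)  # sample rows without replacement
        X_sub = [X[i] for i in idx]
        y_sub = [y[i] for i in idx]
        counts.update(top_k_features(X_sub, y_sub, k))
    return [counts[j] / n_rounds for j in range(n_features_of(X))]


def n_features_of(X):
    """Number of columns in a list-of-rows matrix."""
    return len(X[0])


def make_toy_data(n=200, n_features=10, seed=1):
    """Synthetic data (an assumption for this sketch): only features 0 and 1
    truly drive y; the remaining features are pure noise."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n)]
    y = [2.0 * row[0] - 2.0 * row[1] + rng.gauss(0, 1) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    freq = selection_frequency(X, y, k=3)
    for j, f in enumerate(freq):
        print(f"feature {j}: selected in {f:.0%} of subsamples")
```

In this toy setup, features that truly matter should be selected in nearly every subsample, while a noise feature that happens to rank highly in one run tends to drop out in others. The same pattern could in principle be applied to clustering by rerunning a clustering method on each subsample and checking how consistently pairs of samples are grouped together.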
