The Artificial Human

Will AI Eat Itself?

Jan 22, 2025
Julia Kemper, a data scientist at NYU who specializes in AI model outputs, and Shayne Longpre, a PhD candidate at MIT leading the Data Provenance Initiative, discuss the alarming concept of 'model collapse.' They explore how AI's reliance on AI-generated training data risks producing homogeneous, bland outputs. Kemper highlights the challenges of improving AI performance under such conditions, while Longpre emphasizes the crucial role of human curation in raising the quality of AI training data. Together, they envision a future where human creativity revitalizes AI's capabilities.
INSIGHT

Model Collapse Definition

  • Model collapse is a degenerative process in which AI-generated data pollutes the training set of each successive model generation.
  • Models trained on this polluted data develop an increasingly distorted picture of reality.
ANECDOTE

Turkey Thanksgiving Example

  • Researchers repeatedly asked an LLM how to cook a turkey for Thanksgiving, feeding its answers back into training.
  • By the fourth generation, the LLM responded with existential questions instead of cooking instructions.
INSIGHT

Regression to the Mean

  • AI models trained on increasingly average data lose their ability to generate diverse outputs.
  • The result is blander output, missing the quirks and outliers present in human-generated data.