The Cloudcast

Training and Labeling Foundational AI Models

Sep 20, 2023
In this episode, Alex Ratner discusses labeling and training LLMs, the challenges LLMs face today, and the importance of data labeling for training. The episode also explores the process of fine-tuning foundational AI models and the significance of purpose-built AI models in scaling AI.
ANECDOTE

Snorkel's Academic Origins

  • Snorkel spun out of the Stanford AI Lab and emphasizes data-centric development.
  • The company and academic work focus on programmatic labeling and production-ready enterprise use cases.
INSIGHT

Data Is The New Interface For Models

  • AI development is shifting from model-centric to data-centric workflows as models grow opaque and large.
  • Labeling, sampling, filtering, and curating data become the primary interface for editing model behavior.
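The programmatic-labeling idea behind this snip can be sketched as follows. This is an illustrative toy in the Snorkel style, not Snorkel's actual API: the labeling functions, label constants, and the simple majority vote are all hypothetical stand-ins for what a real system (which also models labeling-function accuracies) would do.

```python
# Toy sketch of programmatic labeling: instead of hand-labeling each
# example, write heuristic labeling functions and combine their votes
# into training labels. All names here are illustrative.
from collections import Counter

ABSTAIN, SPAM, HAM = -1, 1, 0

def lf_contains_link(text):
    # Heuristic: messages with URLs are likely spam.
    return SPAM if "http://" in text or "https://" in text else ABSTAIN

def lf_short_message(text):
    # Heuristic: very short messages are likely ham.
    return HAM if len(text.split()) < 4 else ABSTAIN

def lf_money_words(text):
    # Heuristic: money-related bait words suggest spam.
    return SPAM if any(w in text.lower() for w in ("free", "winner", "cash")) else ABSTAIN

LABELING_FUNCTIONS = [lf_contains_link, lf_short_message, lf_money_words]

def majority_vote(text):
    """Combine labeling-function votes by majority; real systems
    instead learn each function's accuracy and weight accordingly."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS if lf(text) != ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]
```

Editing a labeling function here changes the training labels, and thus the downstream model's behavior, without touching the model itself — the "data as interface" point made in the episode.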
INSIGHT

Ingredients Matter More Than The Recipe

  • Foundation models and training algorithms are rapidly standardizing while the data and ingredients determine performance.
  • Careful data sampling and filtering alone can produce state-of-the-art gains even with fixed models.
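The data-side interventions mentioned above can be made concrete with a minimal sketch. This is not any specific production pipeline — the thresholds and helper name are assumptions — but it shows the kind of sampling and filtering step that can shift results while the model stays fixed.

```python
# Illustrative corpus-curation pass: exact-duplicate removal plus a
# length-based quality filter, applied before any training happens.
# `curate`, `min_words`, and `max_words` are hypothetical names/values.
def curate(corpus, min_words=5, max_words=10000):
    seen = set()
    kept = []
    for doc in corpus:
        text = doc.strip()
        n = len(text.split())
        if n < min_words or n > max_words:  # drop too-short / too-long docs
            continue
        key = text.lower()                  # case-insensitive exact dedupe
        if key in seen:
            continue
        seen.add(key)
        kept.append(text)
    return kept
```

Real pipelines add near-duplicate detection, classifier-based quality scores, and domain-mix sampling, but the shape is the same: the model and training recipe are untouched; only the ingredients change.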