The Gradient: Perspectives on AI

Kate Park: Data Engines for Vision and Language

Mar 21, 2024
Kate Park, Director of Product at Scale AI, discusses the importance of data in AI systems, focusing on self-driving vehicles and NLP applications. The episode explores challenges in model evaluation, the use of expert AI trainers, and the role of humans in labeling tasks.
ANECDOTE

Data Engine's First Success

  • Kate Park built Tesla's data engine to improve model performance with data alone.
  • Its first success was unblocking the Navigate on Autopilot release by targeting edge cases in object and fork detection.
INSIGHT

Data Scaling and Plateaus

  • Data scaling experiments measure how model performance changes as the volume of training data grows; the resulting curve often plateaus.
  • A plateau helps determine when to shift focus from adding data to architectural improvements.
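
A minimal sketch of how such a scaling experiment could be read, assuming you already have a validation score for models trained on increasing data sizes. The function names, the example numbers, and the `min_gain` threshold are all hypothetical and are not taken from the episode.

```python
from math import log2

def gains_per_doubling(sizes, scores):
    """Metric gain per doubling of data between consecutive runs."""
    gains = []
    for i in range(len(sizes) - 1):
        doublings = log2(sizes[i + 1] / sizes[i])
        gains.append((scores[i + 1] - scores[i]) / doublings)
    return gains

def plateau_start(sizes, scores, min_gain=0.005):
    """Smallest size after which every further doubling gains less than
    min_gain (an assumed threshold). Returns None if there is no plateau."""
    gains = gains_per_doubling(sizes, scores)
    start = None
    # Walk backwards so only a trailing run of small gains counts.
    for i in range(len(gains) - 1, -1, -1):
        if gains[i] < min_gain:
            start = sizes[i]
        else:
            break
    return start

# Made-up results: training-set size and validation accuracy.
sizes = [1000, 2000, 4000, 8000, 16000]
scores = [0.70, 0.78, 0.83, 0.832, 0.833]
print(plateau_start(sizes, scores))
```

With these illustrative numbers the gains shrink sharply after 4,000 examples, which is the kind of signal that might prompt a look at the architecture rather than more labeling.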
ADVICE

Prioritizing Data Improvements

  • Prioritize data improvements by their expected impact and the resources they require.
  • Address major issues before minor improvements, weighing the cost and time of labeling.
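
One way to make this ranking concrete is sketched below, assuming each candidate fix comes with a rough estimate of its metric gain, labeling cost, and labeling time. The `DataFix` fields, the `major_gain` cutoff, the `cost_per_day` conversion, and every number are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass

@dataclass
class DataFix:
    name: str
    expected_gain: float  # estimated metric improvement (assumed)
    labeling_cost: float  # estimated labeling spend (assumed)
    labeling_days: float  # estimated labeling time in days (assumed)

def prioritize(fixes, major_gain=0.02, cost_per_day=100.0):
    """Order fixes: major issues first, then by gain per unit of effort."""
    def key(fix):
        # Fold time into cost using an assumed daily rate.
        effort = fix.labeling_cost + fix.labeling_days * cost_per_day
        is_minor = fix.expected_gain < major_gain
        return (is_minor, -fix.expected_gain / effort)
    return sorted(fixes, key=key)

# Made-up candidates for illustration.
fixes = [
    DataFix("rare sign classes", 0.01, 200, 1),
    DataFix("occluded objects", 0.03, 3000, 10),
    DataFix("fork detection edge cases", 0.05, 2000, 5),
]
for fix in prioritize(fixes):
    print(fix.name)
```

The two-part sort key keeps large problems ahead of cheap but minor wins; a team that prefers pure return on effort could drop the `is_minor` term.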