Generally Intelligent

Episode 36: Ari Morcos, DatologyAI: On leveraging data to democratize model training

Jul 11, 2024
Ari Morcos, the CEO of DatologyAI and former researcher at DeepMind and FAIR, dives into the fascinating world of data and deep learning. He explores the nuances of data quality, emphasizing the distinction between hard and bad data points. The conversation touches on the evolution of image representation models and the critical role of data selection for model training. Ari also warns against the careless use of synthetic data and discusses how careful curation can boost model performance. Overall, it's a deep dive into optimizing data for smarter AI.
01:34:19

Podcast summary created with Snipd AI

Quick takeaways

  • Ari Morcos emphasizes that strategic data management can yield performance improvements that defy traditional scaling expectations and cost projections.
  • Morcos's unique transition from neuroscience to AI underscores the significance of effective data representation in modeling cognitive processes.

Deep dives

Data Scaling and Performance

Using data well can yield performance gains that exceed what standard scaling laws predict. Those laws imply diminishing returns: each additional increment of data buys a smaller improvement, so progress slows and becomes harder to sustain. This raises concerns, since some projections put the cost of advanced models on an exponential trajectory, which is hard to reconcile with practical scalability. More efficient data management promises faster improvement at lower cost than these conventional scaling laws would suggest.
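
To make the diminishing-returns point concrete, here is a minimal sketch that assumes error falls as a power law in dataset size, error = a * N^(-alpha). The functional form, the constants `a` and `alpha`, and both function names are illustrative assumptions, not figures or methods taken from the episode.

```python
def power_law_error(n_examples: float, a: float = 1.0, alpha: float = 0.1) -> float:
    """Hypothetical power-law scaling: error = a * n_examples ** (-alpha)."""
    return a * n_examples ** (-alpha)


def data_multiplier_to_halve_error(alpha: float) -> float:
    """Factor by which the dataset must grow to halve error under the assumed law."""
    # a * (k * N) ** (-alpha) = 0.5 * a * N ** (-alpha)  =>  k = 2 ** (1 / alpha)
    return 2.0 ** (1.0 / alpha)


if __name__ == "__main__":
    # Assumed exponents, chosen only to show how sensitive the data cost is to alpha.
    for alpha in (0.1, 0.3, 0.5):
        k = data_multiplier_to_halve_error(alpha)
        print(f"alpha={alpha}: about {k:,.0f}x more data to halve error")
```

Under this assumed form, a small exponent such as 0.1 means halving error takes roughly a thousandfold more data, which is one way to read the steep cost projections mentioned above. Better data selection is aimed at improving on this curve, not just moving further along it.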
