

Episode 36: Ari Morcos, DatologyAI: On leveraging data to democratize model training
Jul 11, 2024
Ari Morcos, the CEO of DatologyAI and a former researcher at DeepMind and FAIR, discusses data and deep learning. He examines data quality, emphasizing the distinction between hard and bad data points. The conversation covers the evolution of image representation models and the critical role of data selection in model training. Ari also warns against the careless use of synthetic data and discusses how careful curation can boost model performance.
Chapters
Intro
00:00 • 2min
From Neurons to Networks: Bridging Neuroscience and Machine Learning
01:37 • 21min
Evolving Perspectives in Image Representation Models
22:21 • 24min
Navigating Data Complexities in Machine Learning
46:37 • 17min
Navigating Synthetic Data in Model Training
01:03:32 • 4min
Navigating Data Quality Challenges
01:07:41 • 14min
Rethinking Data Definitions in Language Modeling
01:21:50 • 4min
Enhancing Model Training Through Data Quality
01:25:54 • 7min
Concerns Over AI-Generated Data and Opportunities in Data Science
01:32:41 • 2min