
Vanishing Gradients
Episode 32: Building Reliable and Robust ML/AI Pipelines
Jul 27, 2024
Join Shreya Shankar, a UC Berkeley researcher specializing in human-centered data management systems, as she navigates the exciting world of large language models (LLMs). Discover her insights on the shift from traditional machine learning to LLMs and why data quality matters more than algorithmic issues. Shreya shares her SPaDE framework for improving AI evaluations and emphasizes the need for human oversight in AI development. Plus, explore the future of low-code tools and the concept of 'Habsburg AI', where models are trained recursively on other models' outputs.
01:15:10
Quick takeaways
- Shreya emphasizes that many challenges in ML stem from data management rather than algorithmic issues, highlighting the need for robust data preparation.
- Data flywheels are crucial for enhancing LLM applications: continually evaluate against production data and fold human feedback back into the pipeline (see the sketch below).
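
A minimal sketch of one turn of such a data flywheel, as described at this level of the episode: score production outputs with a cheap automated check, route failures to human review, and fold the feedback into the next prompt revision. All names and the scoring logic below are illustrative stand-ins, not Shreya's implementation.

```python
"""One illustrative turn of a data-flywheel loop for an LLM application."""

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Trace:
    """One production interaction: user input, model output, optional human label."""
    user_input: str
    model_output: str
    human_label: Optional[str] = None  # filled in when a reviewer looks at it


def auto_score(trace: Trace) -> float:
    """Cheap automated check (placeholder): penalize very short outputs."""
    return min(len(trace.model_output) / 50.0, 1.0)


def flywheel_iteration(prompt: str, traces: List[Trace]) -> str:
    """Evaluate production traces, collect human feedback on failures,
    and append the lessons to the prompt for the next deployment."""
    flagged = [t for t in traces if auto_score(t) < 0.5]
    for t in flagged:
        # In practice this is a human-review queue; here we fake the label.
        t.human_label = "needs more detail"
    if flagged:
        notes = "; ".join(t.human_label for t in flagged if t.human_label)
        prompt += "\n# Reviewer feedback: " + notes
    return prompt


if __name__ == "__main__":
    traces = [
        Trace("Summarize the report", "Too short."),
        Trace("Summarize the report", "A detailed, multi-sentence summary of the report body."),
    ]
    print(flywheel_iteration("You are a careful summarizer.", traces))
```

The point of the loop is that evaluation is never a one-off step: each deployment generates new traces, which generate new feedback, which improves the next deployment.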
Deep dives
Exploring the Groundwork of AI Pipelines
The episode digs into what it takes to build reliable AI pipelines, arguing that many challenges in machine learning stem from data management rather than purely algorithmic issues. Drawing on her experience as an ML engineer, Shreya Shankar notes that the majority of her job involved engineering and assessing data quality, while actual model training was a small fraction of the work. That experience motivates a sustained focus on data preparation and engineering as the precondition for successful deployments, and a broader embrace of data-centric AI approaches that prioritize data quality and management over model training alone.