Data Brew by Databricks

Latest episodes

Jun 17, 2021 • 36min

Data Brew Season 2 Episode 6: AutoML

For our second season of Data Brew, we will be focusing on machine learning, from research to production. We will interview folks in academia and industry to discuss topics such as data ethics, production-grade infrastructure for ML, hyperparameter tuning, AutoML, and many more. Erin LeDell shares valuable insight on AutoML, what problems are best solved by it, its current limitations, and her thoughts on the future of AutoML. We also discuss founding and growing the Women in Machine Learning and Data Science (WiMLDS) non-profit. See more at databricks.com/data-brew
Jun 10, 2021 • 33min

Data Brew Season 2 Episode 5: ML Applications

Good machine learning starts with high-quality data. Irina Malkova shares her experience managing and ensuring high-fidelity data, developing custom metrics to satisfy business needs, and discusses how to improve internal decision-making processes. See more at databricks.com/data-brew
May 13, 2021 • 33min

Data Brew Season 2 Episode 4: Hyperparameter and Neural Architecture Search

Liam Li is a leading researcher in the fields of hyperparameter optimization and neural architecture search, and is the author of the seminal Hyperband paper. In this session, Liam discusses the evolution of hyperparameter optimization techniques and illustrates how every data scientist can benefit from neural architecture search. See more at databricks.com/data-brew
May 5, 2021 • 31min

Data Brew Season 2 Episode 3: Infrastructure for ML

Adam Oliner discusses how to design your infrastructure to support ML, from integration tests to glue code, the importance of iteration, and centralized vs. decentralized data science teams. He provides valuable advice for companies investing in ML and crucial lessons he’s learned from founding two companies. See more at databricks.com/data-brew
Apr 28, 2021 • 26min

Data Brew Season 2 Episode 2: Data Ethics

The podcast discusses topics such as data ethics, fair lending practices, adversarial debiasing, responsible AI, the power of SHAP in explaining models, and various figures in the field of data ethics.
Apr 22, 2021 • 31min

Data Brew Season 2 Episode 1: ML in Production

For our second season, we will be focusing on machine learning, from research to production. We will interview folks in academia and industry to discuss topics such as data ethics, production-grade infrastructure for ML, hyperparameter tuning, AutoML, and many more. In the season opener, Matei Zaharia discusses how he entered the field of ML, best practices for productionizing ML pipelines, leveraging MLflow & the Lakehouse architecture for reproducible ML, and his current research in this field. See more at databricks.com/data-brew
Feb 18, 2021 • 40min

Data Brew Season 1 Episode 6: Journey of Big Data

Speakers discuss their personal journeys into big data, the advantages of using structured APIs and structured streaming, the importance of structured data and excitement for learning, the challenges and ethical issues in data management, and the challenges of conducting landline telephone polls and motivation for writing.
Jan 6, 2021 • 36min

Data Brew Season 1 Episode 5: Combining Machine Learning and MLflow with your Lakehouse

The podcast discusses how Quby leverages ML to extract value from their data lake in the energy industry. They explore using energy data to create data-driven services and the challenges of clustering algorithms. They also discuss less intrusive monitoring methods, data transformation for privacy compliance, and obtaining permission from users.
Dec 22, 2020 • 29min

Data Brew Season 1 Episode 4: BI on Data Lakes - Making it Real for Retail

In this session, we discuss the lessons learned with Lara Minor, Senior Enterprise Data Manager at Columbia Sportswear, on how her team achieved a 70% reduction in pipeline creation time. This reduced ETL workload times from four hours with previous data warehouses to minutes, enabling near real-time analytics. Her team migrated from multiple legacy data warehouses, run by individual lines of business, to a single scalable, reliable, performant data lake. See more at databricks.com/data-brew
Dec 6, 2020 • 26min

Data Brew Season 1 Episode 3: Demystifying Delta Lake

In this podcast, Michael Armbrust, the creator of Spark SQL, discusses the conception and evolution of Delta Lake, efficient querying and troubleshooting slow queries, optimizing performance and query speed, understanding partitioning and Z-Ordering, and exciting features for data ingestion and schema handling in Delta Lake.
