MLOps.community

Demetrios

Relaxed Conversations around getting AI into production, whatever shape that may come in (agentic, traditional ML, LLMs, Vibes, etc)

Episodes

Jul 23, 2022 • 1h 12min

Building a Culture of Experimentation to Speed Up Data-Driven Value // Delina Ivanova // MLOps Coffee Sessions #106

MLOps Coffee Sessions #106 with Delina Ivanova, Associate Director, Data & Insights at HelloFresh Canada, Building a Culture of Experimentation to Speed Up Data-Driven Value, co-hosted by Vishnu Rachakonda. // Abstract Supply chain and manufacturing are prime areas where the use of data science, analytics, and ML is underdeveloped, and experimentation is required to collect data and enable data-driven solutions. This talk encourages companies to conduct experiments and collect data over time in order to build accurate and scalable data-driven solutions. // Bio Delina has over 10 years of experience across data and analytics, consulting, and strategy with roles spanning financial services, public sector, and CPG industries. She is currently the Associate Director, Data & Insights at HelloFresh Canada where she leads a full-service data team, including data engineering, data science, and business intelligence and automation. She is also a Data Science and Machine Learning instructor in the professional development programs at the University of Toronto and the University of Waterloo.
// MLOps Jobs board https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links The Discourses of Epictetus book: https://www.amazon.com/Discourses-Epictetus/dp/1537427180 The Pyramid Principle: Logic in Writing and Thinking book by Barbara Minto: https://www.amazon.com/Pyramid-Principle-Logic-Writing-Thinking/dp/0273710516 --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/ Connect with Delina on LinkedIn: https://www.linkedin.com/in/delina-ivanova/ Timestamps: [00:00] Introduction to Delina Ivanova [00:35] Takeaways [03:46] Looking for People to organize local Meetups! [04:30] Delina's career trajectories and growth to the corporate schema [10:02] Telling stories with data [13:23] Tricks for being a translator from the business side to data teams [15:32] Technical engineering management and Delina's day-to-day role [20:40] Giving up day-to-day individual contributing work and coding [23:33] Good leadership for technical work [31:05] Growing team growing productivity [32:55] Pressured to grow [35:23] HelloFresh [39:39] Challenges of e-commerce, CPG, Logistics, and grocery combined [41:08] Cultural differences [46:04] Rapid fire session [52:20] Wrap up

Jul 1, 2022 • 1h 6min

Cleanlab: Labeled Datasets that Correct Themselves Automatically // Curtis Northcutt // MLOps Coffee Sessions #105

In this episode, Curtis Northcutt, CEO & Co-Founder of Cleanlab, discusses the importance of data-centric AI and the challenges of addressing noisy data. They also delve into the journey of Cleanlab in improving data labeling accuracy, the success of the startup in finding and correcting bad data, and the frustrations of bug smashing. Additionally, they explore the challenges of understanding the value and capabilities of AI tools and companies, as well as the hiring opportunities in DevRel and front-end engineering.

Jun 24, 2022 • 52min

MLOps + BI? // Maxime Beauchemin // MLOps Coffee Sessions #104

MLOps Coffee Sessions #104 with the creator of Apache Airflow and Apache Superset, Maxime Beauchemin, Future of BI, co-hosted by Vishnu Rachakonda. // Abstract // Bio Maxime Beauchemin is the founder and CEO of Preset and the original creator of Apache Superset. Max has worked at the leading edge of data and analytics his entire career, helping shape the discipline in influential roles at data-dependent companies like Yahoo!, Lyft, Airbnb, Facebook, and Ubisoft. // MLOps Jobs board https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Website: https://www.rungalileo.io/ Trade-Off: Why Some Things Catch On, and Others book by Kevin Maney: https://www.amazon.com/Trade-Off-Some-Things-Catch-Others/dp/0385525958 --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/ Connect with Max on LinkedIn: https://www.linkedin.com/in/maximebeauchemin/ Timestamps: [00:00] Introduction to Maxime Beauchemin [01:28] Takeaways [03:42] Paradigm of data warehouse [06:38] Entity-centric data modeling [11:33] Metadata for metadata [14:24] Problem of data organization for a rapidly scaling organization [18:36] Machine Learning tooling as a subset or of its own [22:28] Airflow: The unsung hero of the data scientists [27:15] Analyzing Airflow [30:44] Disrupting the field [34:45] Solutions to the ladder problem of empowering exploratory work and mortals superpowers with data [38:04] What to watch out for when building for data scientists [41:47] Rapid fire questions [51:12] Wrap up

Jun 17, 2022 • 1h 5min

Making MLflow // Lead MLflow Maintainer Corey Zumar // MLOps Coffee Sessions #103

MLOps Coffee Sessions #103 with Corey Zumar, MLOps Podcast on Making MLflow co-hosted by Mihail Eric. // Abstract Because MLOps is a broad ecosystem of rapidly evolving tools and techniques, it creates several requirements and challenges for platform developers: - To serve the needs of many practitioners and organizations, it's important for MLOps platforms to support a variety of tools in the ecosystem. This necessitates extra scrutiny when designing APIs, as well as rigorous testing strategies to ensure compatibility. - Extensibility to new tools and frameworks is a must, but it's important not to sacrifice maintainability. MLflow Plugins (https://www.mlflow.org/docs/latest/plugins.html) is a great example of striking this balance. - Open source is a great space for MLOps platforms to flourish. MLflow's growth has been heavily aided by: 1. meaningful feedback from a community of ML practitioners with a wide range of use cases and workflows & 2. collaboration with industry experts from a variety of organizations to co-develop APIs that are becoming standards in the MLOps space. // Bio Corey Zumar is a software engineer at Databricks, where he’s spent the last four years working on machine learning infrastructure and APIs for the machine learning lifecycle, including model management and production deployment. Corey is an active developer of MLflow. He holds a master’s degree in computer science from UC Berkeley. 
// MLOps Jobs board https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Mihail on LinkedIn: https://www.linkedin.com/in/mihaileric/ Connect with Corey on LinkedIn: https://www.linkedin.com/in/corey-zumar/ Timestamps: [00:00] Origin story of MLflow [02:12] Spark as a big player [03:12] Key insights [04:42] Core abstractions and principles on MLflow's success [07:08] Product development with open-source [09:29] Fine line between competing principles [11:53] Shameless way to pursue collaboration [12:24] Right go-to-market open-source [16:27] Vanity metrics [18:57] First gate of MLOps drug [22:11] Project fundamentals [24:29] Through the pillars [26:14] Best in breed or one tool to rule them all [29:16] MLOps space mature with the MLOps tool [30:49] Ultimate vision for MLflow [33:56] Alignment of end-users and business values [38:11] Adding a project abstraction separate from the current ML project [42:03] Implementing bigger bets in certain directions [44:54] Log in features to experiment page [45:46] Challenge when operationalizing MLflow in their stack [48:34] What would you work on if it weren't MLflow? [49:52] Something to put on top of MLflow [51:42] Proxy metric [52:39] Feature Stores and MLflow [54:33] Lightning round [57:36] Wrap up

Jun 10, 2022 • 52min

Fixing Your ML Data Blind Spots // Yash Sheth // MLOps Coffee Sessions #102

MLOps Coffee Sessions #102 with Yash Sheth, Fixing Your ML Data Blind Spots, co-hosted by Adam Sroka. // Abstract Improving your dataset quality is absolutely critical for effective ML. Finding errors in your datasets is generally a slow, iterative, and painstaking process. Data scientists should be proactively fixing their model’s blind spots by improving their training data. In this talk, Yash discusses how Galileo helps data scientists identify, fix, and track data across the entire ML workflow. // Bio Yash Sheth is the co-founder and VP of Engineering at Galileo. Prior to starting Galileo, Yash spent the last decade working on Automatic Speech Recognition (ASR) at Google, leading their core speech recognition platform team that powers speech-to-text across 20+ products at Google in over 80 languages, along with thousands of businesses through their Cloud Speech API. // MLOps Jobs board https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Website: https://www.rungalileo.io/ Trade-Off: Why Some Things Catch On, and Others book by Kevin Maney: https://www.amazon.com/Trade-Off-Some-Things-Catch-Others/dp/0385525958 --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Adam on LinkedIn: https://www.linkedin.com/in/aesroka/ Connect with Yash on LinkedIn: https://www.linkedin.com/in/yash-sheth-72111216/ Timestamps: [00:00] Introduction to Yash Sheth [02:53] Takeaways [04:35] Why unstructured data?
[06:59] Fitting in the workflow [10:56] Digging into the different pains [18:23] Vision around the democratization of machine learning [24:31] Unstructured data problem [25:49] Galileo handling unified tools [27:21] Calculus for ML [28:45] Gatekeep [29:49] Synthetic data in the unstructured data world of Galileo [33:10] Tips for data scientists that have unstructured data but with a small data set [35:00] Benefits of users from Galileo [37:15] Business case for dummies [42:36] War stories [44:49] Rapid fire questions [50:55] Wrap up

Jun 3, 2022 • 59min

Declarative Machine Learning Systems: Big Tech Level ML Without a Big Tech Team // Piero Molino // MLOps Coffee Sessions #101

MLOps Coffee Sessions #101 with Piero Molino, Declarative Machine Learning Systems: Big Tech Level ML Without a Big Tech Team, co-hosted by Vishnu Rachakonda. // Abstract Declarative Machine Learning Systems are the next step in the evolution of Machine Learning infrastructure. With such systems, organizations can marry the flexibility of low-level APIs with the simplicity of AutoML. Companies adopting such systems can increase the speed of machine learning development, reaching the quality and scalability that only big tech companies could achieve until now, without the need for a team of several thousand people. Predibase is the turnkey solution for adopting declarative ML systems at an enterprise scale. // Bio Piero Molino is CEO and co-founder of Predibase, a company redefining ML tooling. Most recently, he has been Staff Research Scientist at Stanford University working on Machine Learning systems and algorithms in Prof. Chris Ré's Hazy group. Piero completed a Ph.D. in Question Answering at the University of Bari, Italy. He founded QuestionCube, a startup that built a framework for semantic search and QA. He worked for Yahoo Labs in Barcelona on learning to rank and for IBM Watson in New York on natural language processing with deep learning, and then joined Geometric Intelligence, where he worked on grounded language understanding. After Uber acquired Geometric Intelligence, Piero became one of the founding members of Uber AI Labs. At Uber, he worked on research topics including Dialogue Systems, Language Generation, Graph Representation Learning, Computer Vision, Reinforcement Learning, and Meta-Learning. He also worked on several deployed systems like COTA, an ML and NLP model for Customer Support, Dialogue Systems for driver's hands-free dispatch, the Uber Eats Recommender System with graph learning and collusion detection. He is the author of Ludwig, a Linux-Foundation-backed open source declarative deep learning framework.
// MLOps Jobs board https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Website: http://w4nderlu.st http://ludwig.ai https://medium.com/ludwig-ai Declarative Machine Learning Systems paper By Piero Molino, Christopher Ré: https://cacm.acm.org/magazines/2022/1/257445-declarative-machine-learning-systems/fulltext Slip of the Keyboard by Sir Terry Pratchett: https://www.terrypratchettbooks.com/books/a-slip-of-the-keyboard/ The Listening Society book series by Hanzi Freinacht: https://www.amazon.com/Listening-Society-Metamodern-Politics-Guides-ebook/dp/B074MKQ4LR --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/ Connect with Piero on LinkedIn: https://www.linkedin.com/in/pieromolino/?locale=en_US

May 27, 2022 • 24min

Scaling Real-time Machine Learning at Chime // Peeyush Agarwal // Lightning Sessions #1

Lightning Sessions #1 with Peeyush Agarwal, Scaling Real-time Machine Learning at Chime. // Abstract In this Lightning Talk, Peeyush Agarwal explains 2 key pieces of the ML infrastructure at Chime. Peeyush goes into detail about the current feature store design and feature monitoring process along with the ML monitoring setup. This Lightning Talk is brought to you by arize.com; reach out to them for all of your ML monitoring needs. // Bio Peeyush Agarwal is the Lead Software Engineer, ML Platform at Chime. He leads the team which enables data science all the way from exploration, model development, and training to orchestrating batch and real-time models in shadow and production. Earlier, Peeyush was a founding engineer in Chime's DSML team and worked on both building models and getting them into production. Before Chime, Peeyush was a software engineer at Google where he developed unsupervised ML models that run on Google's data across search, Chrome, YouTube, and other properties to identify intent and use it for personalized ads and recommendations. At Google, he also worked on ML-powered Adaptive Brightness and Adaptive Battery which were launched into Android. Prior to joining Google, Peeyush was an entrepreneur who founded a customer engagement platform that counted Aurelia, Reebok, W, and Red Chief among its clients.
// MLOps Jobs board https://mlops.pallet.xyz/jobs // Related Links arize.com --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Peeyush on LinkedIn: https://www.linkedin.com/in/apeeyush/ Timestamps: [00:00] Introduction to Peeyush Agarwal [01:08] Agenda [01:27] What Chime is and what Chime does [01:44] Chime's products [02:27] Data Science and Machine Learning at Chime [08:06] Chime's first real-time model [08:09] Preventing fraud on Pay Friends [11:01] Feature Store: Unblock real-time capability [12:40] Preventing fraud on Pay Friends: Monitoring [13:35] Preventing fraud on Pay Friends: Instrumentation [14:36] Monitoring: 4 diverse ways to triage [15:27] Examples of Metrics: Feature and Model Metrics [16:39] Scaling Real-time ML at Chime [17:09] Scaling Real-time ML: Monitoring and Alerting [18:28] Scaling Real-time ML: Build tools [20:13] Scaling Real-time ML: Infrastructure Orchestration [21:36] Scaling Real-time ML: Lessons
