MLOps.community

Demetrios
Jul 23, 2022 • 1h 12min

Why You Need More Than Airflow // Ketan Umare // Coffee Sessions #109

Ketan Umare, Co-founder and CEO of Union.ai, shares insights from his extensive experience at Lyft, Oracle, and Amazon. He discusses the limitations of Airflow in machine learning, emphasizing the need for ML-specific orchestration tools. The conversation covers the complexities of data pipelines, the importance of effective feature management, and the challenges of model drift. Ketan also highlights cloud-native solutions, security in modern engineering, and innovative programming collaborations, all while offering book recommendations that tie historical lessons to today's tech landscape.
15 snips
Jul 19, 2022 • 1h 6min

ML Flow vs Kubeflow 2022 // Byron Allen // Coffee Sessions #108

MLOps Coffee Sessions #108 with Byron Allen, AI & ML Practice Lead at Contino, ML Flow vs Kubeflow 2022 co-hosted by George Pearse.
// Abstract
The amazing Byron Allen talks to us about why MLflow and Kubeflow are not playing the same game! MLflow vs Kubeflow is more like comparing apples to oranges or, in the analogy he likes to make, they are both cheese, but one is an all-rounder and the other a high-class delicacy. This can be quite deceiving when analyzing the two. We do a deep dive into the functionalities of both and the pros/cons they have to offer.
// Bio
Byron wears several hats: AI & ML practice lead, solutions architect, ML engineer, data engineer, data scientist, Google Cloud Authorized Trainer, and scrum master. He has a track record of successfully advising on and delivering data science platforms and projects. Byron has a mix of technical capability, business acumen, and communication skills that make him an effective leader, team player, and technology advocate. See Byron write at https://medium.com/@byron.allen
// MLOps Jobs board https://mlops.pallet.xyz/jobs
MLOps Swag/Merch https://mlops-community.myshopify.com/
// Related Links
--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with George on LinkedIn: https://www.linkedin.com/in/george-pearse-b7a76a157/?originalSubdomain=uk
Connect with Byron on LinkedIn: https://www.linkedin.com/in/byronaallen/
Timestamps:
[00:00] Introduction to Byron Allen
[01:10] Introduction to the new co-host George Pearse
[01:41] ML Flow vs Kubeflow
[05:40] George's take on ML Flow and Kubeflow
[07:28] Writing in YAML
[09:47] Developer experience
[13:38] Changes in ML Flow and Kubeflow
[17:58] Messing around ML Flow Serving
[20:00] A taste of Kubeflow through K-Serve
[23:18] Managed service of Kubeflow
[25:15] How George used Kubeflow
[27:45] Getting the Managed Service
[31:30] Getting Authentication
[32:41] ML Flow docs vs Kubeflow docs
[36:59] Kubeflow community incentives
[42:25] MLOps Search term
[42:52] Organizational problem
[43:50] Final thoughts on ML Flow and Kubeflow
[49:19] Bonus
[49:35] Entity-Centric Modeling
[52:11] Semantic Layer options
[57:27] Semantic Layer with Machine Learning
[58:40] Satellite Infra Images demo
[1:00:49] Motivation to move away from SQL
[1:03:00] Managing SQL
[1:05:24] Wrap up
10 snips
Jul 11, 2022 • 59min

Why and When to Use Kubeflow for MLOps // Ryan Russon // Coffee Sessions #107

MLOps Coffee Sessions #107 with Ryan Russon, Manager, MLOps and Data Science of Maven Wave Partners, Why and When to Use Kubeflow for MLOps co-hosted by Mihail Eric.
// Abstract
Kubeflow is an excellent platform if your team is already leveraging Kubernetes and allows for a truly collaborative experience. Let’s take a deep dive into the pros and cons of using Kubeflow in your MLOps.
// Bio
From serving as an officer in the US Navy to consulting for some of America's largest corporations, Ryan has found his passion in the enablement of Data Science workloads for companies and teams. Having spent years as a data scientist, Ryan understands the types of challenges that DS teams face in scaling, tracking, and efficiently running their workloads.
// MLOps Jobs board https://mlops.pallet.xyz/jobs
// MLOps Swag/Merch https://mlops-community.myshopify.com/
// Related Links
https://www.mavenwave.com/
https://go.mlops.community/hFApDb
--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Mihail on LinkedIn: https://www.linkedin.com/in/mihaileric/
Connect with Ryan on LinkedIn: https://www.linkedin.com/in/ryanrusson/
Timestamps:
[00:00] Introduction to Ryan Russon
[01:13] Takeaways
[04:17] Bullish on KubeFlow!
[06:23] KubeFlow in ML tooling
[11:47] Kubeflow having its velocity
[14:16] To Kubeflow or not to Kubeflow
[18:25] KubeFlow ecosystem maturity
[20:51] Alternatively starting from scratch?
[23:11] Argo workflow vs KubeFlow pipelines
[25:08] KubeFlow as an end-state for citizen data scientists
[28:24] End-to-end workflow key players
[31:17] K-serve
[33:41] KubeFlow on orchestrators
[36:24] Natural transition to KubeFlow maturity
[41:33] "Don't forget about the engineer cost."
[42:21] KubeFlow to other "Flow brothers" trade-offs
[46:12] Biggest MLOps challenge
[49:52] Best practices around file structure
[52:15] KubeFlow changes over the years and what to expect moving forward
[55:52] Best-of-breed vision
[57:54] Wrap up
9 snips
Jul 5, 2022 • 54min

Building a Culture of Experimentation to Speed Up Data-Driven Value // Delina Ivanova // MLOps Coffee Sessions #106

MLOps Coffee Sessions #106 with Delina Ivanova, Associate Director, Data of HelloFresh, Building a Culture of Experimentation to Speed Up Data-Driven Value co-hosted by Vishnu Rachakonda.
// Abstract
Supply chain/manufacturing are prime areas where the use of data science/analytics/ML is underdeveloped, and experimentation is required to collect data and enable data-driven solutions. This talk encourages companies to conduct experiments and collect data over time in order to build accurate/scalable data-driven solutions.
// Bio
Delina has over 10 years of experience across data and analytics, consulting, and strategy with roles spanning financial services, public sector, and CPG industries. She is currently the Associate Director, Data & Insights at HelloFresh Canada where she leads a full-service data team, including data engineering, data science, and business intelligence and automation. She is also a Data Science and Machine Learning instructor in the professional development programs at the University of Toronto and the University of Waterloo.
// MLOps Jobs board https://mlops.pallet.xyz/jobs
MLOps Swag/Merch https://mlops-community.myshopify.com/
// Related Links
The Discourses of Epictetus book: https://www.amazon.com/Discourses-Epictetus/dp/1537427180
The Pyramid Principle: Logic in Writing and Thinking book by Barbara Minto: https://www.amazon.com/Pyramid-Principle-Logic-Writing-Thinking/dp/0273710516
--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with Delina on LinkedIn: https://www.linkedin.com/in/delina-ivanova/
Timestamps:
[00:00] Introduction to Delina Ivanova
[00:35] Takeaways
[03:46] Looking for People to organize local Meetups!
[04:30] Delina's career trajectories and growth to the corporate schema
[10:02] Telling stories with data
[13:23] Tricks for being a translator from the business side to data teams
[15:32] Technical engineering management and Delina's day-to-day role
[20:40] Giving up day-to-day individual contributing work and coding
[23:33] Good leadership for technical work
[31:05] Growing team growing productivity
[32:55] Pressured to grow
[35:23] HelloFresh
[39:39] Challenges of e-commerce, CPG, Logistics, and grocery combined
[41:08] Cultural differences
[46:04] Rapid fire session
[52:20] Wrap up
8 snips
Jul 1, 2022 • 1h 6min

Cleanlab: Labeled Datasets that Correct Themselves Automatically // Curtis Northcutt // MLOps Coffee Sessions #105

In this episode, Curtis Northcutt, CEO & Co-Founder of Cleanlab, discusses the importance of data-centric AI and the challenges of addressing noisy data. They also delve into the journey of Cleanlab in improving data labeling accuracy, the success of the startup in finding and correcting bad data, and the frustrations of bug smashing. Additionally, they explore the challenges of understanding the value and capabilities of AI tools and companies, as well as the hiring opportunities in DevRel and front-end engineering.
Jun 24, 2022 • 52min

MLOps + BI? // Maxime Beauchemin // MLOps Coffee Sessions #104

MLOps Coffee Sessions #104 with the creator of Apache Airflow and Apache Superset Maxime Beauchemin, Future of BI co-hosted by Vishnu Rachakonda.
// Abstract
// Bio
Maxime Beauchemin is the founder and CEO of Preset. Original creator of Apache Superset. Max has worked at the leading edge of data and analytics his entire career, helping shape the discipline in influential roles at data-dependent companies like Yahoo!, Lyft, Airbnb, Facebook, and Ubisoft.
// MLOps Jobs board https://mlops.pallet.xyz/jobs
MLOps Swag/Merch https://mlops-community.myshopify.com/
// Related Links
Website: https://www.rungalileo.io/
Trade-Off: Why Some Things Catch On, and Others book by Kevin Maney: https://www.amazon.com/Trade-Off-Some-Things-Catch-Others/dp/0385525958
--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with Max on LinkedIn: https://www.linkedin.com/in/maximebeauchemin/
Timestamps:
[00:00] Introduction to Maxime Beauchemin
[01:28] Takeaways
[03:42] Paradigm of data warehouse
[06:38] Entity-centric data modeling
[11:33] Metadata for metadata
[14:24] Problem of data organization for a rapidly scaling organization
[18:36] Machine Learning tooling as a subset or of its own
[22:28] Airflow: The unsung hero of the data scientists
[27:15] Analyzing Airflow
[30:44] Disrupting the field
[34:45] Solutions to the ladder problem of empowering exploratory work and mortals superpowers with data
[38:04] What to watch out for when building for data scientists
[41:47] Rapid fire questions
[51:12] Wrap up
4 snips
Jun 17, 2022 • 1h 5min

Making MLFlow // Lead MLFlow Maintainer Corey Zumar // MLOps Coffee Sessions #103

MLOps Coffee Sessions #103 with Corey Zumar, MLOps Podcast on Making MLflow co-hosted by Mihail Eric.
// Abstract
Because MLOps is a broad ecosystem of rapidly evolving tools and techniques, it creates several requirements and challenges for platform developers:
- To serve the needs of many practitioners and organizations, it's important for MLOps platforms to support a variety of tools in the ecosystem. This necessitates extra scrutiny when designing APIs, as well as rigorous testing strategies to ensure compatibility.
- Extensibility to new tools and frameworks is a must, but it's important not to sacrifice maintainability. MLflow Plugins (https://www.mlflow.org/docs/latest/plugins.html) is a great example of striking this balance.
- Open source is a great space for MLOps platforms to flourish. MLflow's growth has been heavily aided by: 1. meaningful feedback from a community of ML practitioners with a wide range of use cases and workflows & 2. collaboration with industry experts from a variety of organizations to co-develop APIs that are becoming standards in the MLOps space.
// Bio
Corey Zumar is a software engineer at Databricks, where he’s spent the last four years working on machine learning infrastructure and APIs for the machine learning lifecycle, including model management and production deployment. Corey is an active developer of MLflow. He holds a master’s degree in computer science from UC Berkeley.
// MLOps Jobs board https://mlops.pallet.xyz/jobs
MLOps Swag/Merch https://mlops-community.myshopify.com/
// Related Links
--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Mihail on LinkedIn: https://www.linkedin.com/in/mihaileric/
Connect with Corey on LinkedIn: https://www.linkedin.com/in/corey-zumar/
Timestamps:
[00:00] Origin story of MLFlow
[02:12] Spark as a big player
[03:12] Key insights
[04:42] Core abstractions and principles on MLFlow's success
[07:08] Product development with open-source
[09:29] Fine line between competing principles
[11:53] Shameless way to pursue collaboration
[12:24] Right go-to-market open-source
[16:27] Vanity metrics
[18:57] First gate of MLOps drug
[22:11] Project fundamentals
[24:29] Through the pillars
[26:14] Best in breed or one tool to rule them all
[29:16] MLOps space mature with the MLOps tool
[30:49] Ultimate vision for MLFlow
[33:56] Alignment of end-users and business values
[38:11] Adding a project abstraction separate from the current ML project
[42:03] Implementing bigger bets in certain directions
[44:54] Log in features to experiment page
[45:46] Challenge when operationalizing MLFlow in their stack
[48:34] What would you work on if it weren't MLFlow?
[49:52] Something to put on top of MLFlow
[51:42] Proxy metric
[52:39] Feature Stores and MLFlow
[54:33] Lightning round
[57:36] Wrap up
Jun 10, 2022 • 52min

Fixing Your ML Data Blind Spots // Yash Sheth // MLOps Coffee Sessions #102

MLOps Coffee Sessions #102 with Yash Sheth, Fixing Your ML Data Blindspots co-hosted by Adam Sroka.
// Abstract
Improving your dataset quality is absolutely critical for effective ML. Finding errors in your datasets is generally a slow, iterative, and painstaking process. Data scientists should be proactively fixing their model’s blindspots by improving their training data. In this talk, Yash discusses how Galileo helps data scientists identify, fix, and track data across the entire ML workflow.
// Bio
Co-founder and VP of Engineering. Prior to starting Galileo, Yash spent the last decade working on Automatic Speech Recognition (ASR) at Google, leading their core speech recognition platform team, which powers speech-to-text across 20+ products at Google in over 80 languages, along with thousands of businesses through their Cloud Speech API.
// MLOps Jobs board https://mlops.pallet.xyz/jobs
MLOps Swag/Merch https://mlops-community.myshopify.com/
// Related Links
Website: https://www.rungalileo.io/
Trade-Off: Why Some Things Catch On, and Others book by Kevin Maney: https://www.amazon.com/Trade-Off-Some-Things-Catch-Others/dp/0385525958
--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Adam on LinkedIn: https://www.linkedin.com/in/aesroka/
Connect with Yash on LinkedIn: https://www.linkedin.com/in/yash-sheth-72111216/
Timestamps:
[00:00] Introduction to Yash Sheth
[02:53] Takeaways
[04:35] Why unstructured data?
[06:59] Fitting in the workflow
[10:56] Digging into the different pains
[18:23] Vision around the democratization of machine learning
[24:31] Unstructured data problem
[25:49] Galileo handling unified tools
[27:21] Calculus for ML
[28:45] Gatekeep
[29:49] Synthetic data in the unstructured data world of Galileo
[33:10] Tips for data scientists that have unstructured data but with a small data set
[35:00] Benefits of users from Galileo
[37:15] Business case for dummies
[42:36] War stories
[44:49] Rapid fire questions
[50:55] Wrap up
Jun 3, 2022 • 59min

Declarative Machine Learning Systems: Big Tech Level ML Without a Big Tech Team // Piero Molino // MLOps Coffee Sessions #101

MLOps Coffee Sessions #101 with Piero Molino, Declarative Machine Learning Systems: Big Tech Level ML Without a Big Tech Team co-hosted by Vishnu Rachakonda.
// Abstract
Declarative Machine Learning Systems are the next step in the evolution of Machine Learning infrastructure. With such systems, organizations can marry the flexibility of low-level APIs with the simplicity of AutoML. Companies adopting such systems can increase the speed of machine learning development, reaching the quality and scalability that only big tech companies could achieve until now, without the need for a team of several thousand people. Predibase is the turnkey solution for adopting declarative ML systems at an enterprise scale.
// Bio
Piero Molino is CEO and co-founder of Predibase, a company redefining ML tooling. Most recently, he has been Staff Research Scientist at Stanford University working on Machine Learning systems and algorithms in Prof. Chris Ré's Hazy group. Piero completed a Ph.D. in Question Answering at the University of Bari, Italy. He founded QuestionCube, a startup that built a framework for semantic search and QA. He worked for Yahoo Labs in Barcelona on learning to rank, IBM Watson in New York on natural language processing with deep learning, and then joined Geometric Intelligence, where he worked on grounded language understanding. After Uber acquired Geometric Intelligence, Piero became one of the founding members of Uber AI Labs. At Uber, he worked on research topics including Dialogue Systems, Language Generation, Graph Representation Learning, Computer Vision, Reinforcement Learning, and Meta-Learning. He also worked on several deployed systems like COTA, an ML and NLP model for Customer Support, Dialogue Systems for driver's hands-free dispatch, the Uber Eats Recommender System with graph learning and collusion detection. He is the author of Ludwig, a Linux-Foundation-backed open source declarative deep learning framework.
// MLOps Jobs board https://mlops.pallet.xyz/jobs
MLOps Swag/Merch https://mlops-community.myshopify.com/
// Related Links
Website: http://w4nderlu.st
http://ludwig.ai
https://medium.com/ludwig-ai
Declarative Machine Learning Systems paper By Piero Molino, Christopher Ré: https://cacm.acm.org/magazines/2022/1/257445-declarative-machine-learning-systems/fulltext
Slip of the Keyboard by Sir Terry Pratchett: https://www.terrypratchettbooks.com/books/a-slip-of-the-keyboard/
The Listening Society book series by Hanzi Freinacht: https://www.amazon.com/Listening-Society-Metamodern-Politics-Guides-ebook/dp/B074MKQ4LR
--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with Piero on LinkedIn: https://www.linkedin.com/in/pieromolino/?locale=en_US
May 27, 2022 • 24min

Scaling Real-time Machine Learning at Chime // Peeyush Agarwal // Lightning Sessions #1

Lightning Sessions #1 with Peeyush Agarwal, Scaling Real-time Machine Learning at Chime.
// Abstract
In this Lightning Talk, Peeyush Agarwal explains 2 key pieces of the ML infrastructure at Chime. Peeyush goes into detail about the current feature store design and feature monitoring process along with the ML monitoring setup. This Lightning Talk is brought to you by arize.com; reach out to them for all of your ML monitoring needs.
// Bio
Peeyush Agarwal is the Lead Software Engineer, ML Platform at Chime. He leads the team which enables data science all the way from exploration, model development, and training to orchestrating batch and real-time models in shadow and production. Earlier, Peeyush was a founding engineer in Chime's DSML team and worked on both building models and getting them into production. Before Chime, Peeyush was a software engineer at Google where he developed unsupervised ML models that run on Google's data across search, Chrome, YouTube, and other properties to identify intent and use it for personalized ads and recommendations. At Google, he also worked on ML-powered Adaptive Brightness and Adaptive Battery which were launched into Android. Prior to joining Google, Peeyush was an entrepreneur who founded a customer engagement platform that counted Aurelia, Reebok, W, and Red Chief among its clients.
// MLOps Jobs board https://mlops.pallet.xyz/jobs
// Related Links
arize.com
--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Peeyush on LinkedIn: https://www.linkedin.com/in/apeeyush/
Timestamps:
[00:00] Introduction to Peeyush Agarwal
[01:08] Agenda
[01:27] What Chime is and what Chime do
[01:44] Chime's products
[02:27] Data Science and Machine Learning at Chime
[08:06] Chime's first real-time model
[08:09] Preventing fraud on Pay Friends
[11:01] Feature Store: Unblock real-time capability
[12:40] Preventing fraud on Pay Friends: Monitoring
[13:35] Preventing fraud on Pay Friends: Instrumentation
[14:36] Monitoring: 4 diverse ways to triage
[15:27] Examples of Metrics: Feature and Model Metrics
[16:39] Scaling Real-time ML at Chime
[17:09] Scaling Real-time ML: Monitoring and Alerting
[18:28] Scaling Real-time ML: Build tools
[20:13] Scaling Real-time ML: Infrastructure Orchestration
[21:36] Scaling Real-time ML: Lessons
