
MLOps.community

Latest episodes

Aug 6, 2022 • 1h 2min

Building Better Data Teams // Leanne Fitzpatrick // Coffee Sessions #113

MLOps Coffee Sessions #113 with Leanne Fitzpatrick, Director of Data Science at the Financial Times, Building Better Data Teams, co-hosted by Mihail Eric.

// Abstract
We spend a lot of time talking about data tooling, but perhaps not as much time talking about data organizations and how to run and organize data teams efficiently. What about starting with limitations instead of aspirations? The right constraints instead of the north star? In this session, let's learn more about a realistic take on the state of data organizations today.

// Bio
Leanne is Director of Data Science at the Financial Times and is a passionate data leader with experience building and developing empowered data science and analytics teams in a variety of businesses. Leanne is in her element when developing and implementing strategic, technical, and cultural solutions to getting machine learning and data science into the operational ecosystem. Leanne is an active part of the data and technology community, sharing innovation and insights to encourage best practices, from Manchester, UK to Austin, TX, and is an Advisory Panel Board Member. Outside of all things data you can ask Leanne about her golf swing (it’s not good - yet), her passion for American Football (specifically the Cincinnati Bengals), her latest sewing project, and her love for good music, food, and whisky.

// MLOps Jobs board
https://mlops.pallet.xyz/jobs

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
Children of Time book by Adrian Tchaikovsky: https://www.amazon.com/Children-Time-Adrian-Tchaikovsky/dp/0316452505

--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Mihail on LinkedIn: https://www.linkedin.com/in/mihaileric/
Connect with Leanne on LinkedIn: https://www.linkedin.com/in/leanne-kim-fitzpatrick-29204341/

Timestamps:
[00:00] Introduction to Leanne Fitzpatrick
[04:23] Write us your suggestions!
[05:43] Tri-pawed dog called Seaweed!
[08:43] How to architect data teams
[14:44] Organizational deficiencies
[19:19] Tensions and conflicts for starters
[24:07] Misunderstandings from marketing
[25:59] The Middle Layer
[28:48] Data science work at publications
[31:11] Mystique of going to real-time
[35:29] Third parties with fraud
[37:40] Augmenting data practitioners with third-party tools
[41:00] Principle of reinventing the wheel and avoiding undifferentiated heavy lifting
[46:29] Different Abstraction Layer recommendations
[48:42] RN Production
[51:56] Will Python eat RN Production away?
[56:05] Julia as a dark horse
[56:39] Future of RN Production
[58:00] Rapid fire questions
Aug 3, 2022 • 50min

MLX: Opinionated ML Pipelines in MLflow // Xiangrui Meng // Coffee Sessions #112

MLOps Coffee Sessions #112 with Xiangrui Meng, Principal Software Engineer at Databricks, MLX: Opinionated ML Pipelines in MLflow, co-hosted by Vishnu Rachakonda.

// Abstract
MLX aims to let data scientists stay mostly within their comfort zone, using their expert knowledge while following best practices in ML development and delivering production-ready ML projects with little help from production engineers and DevOps.

// Bio
Xiangrui Meng is a Principal Software Engineer at Databricks and an Apache Spark PMC member. His main interests center around simplifying the end-to-end user experience of building machine learning applications, from algorithms to platforms and to operations.

// Related Links
Good Strategy Bad Strategy: The Difference and Why It Matters book by Richard Rumelt: https://www.amazon.com/Good-Strategy-Bad-Difference-Matters/dp/0307886239

Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with Xiangrui on LinkedIn: https://www.linkedin.com/in/mengxr/

Timestamps:
[00:00] Introduction to Xiangrui Meng
[00:39] Takeaways
[02:09] Xiangrui's background
[03:38] What kept Xiangrui in Databricks
[07:33] What needs to be done to get there
[09:20] Machine Learning passion of Xiangrui
[11:52] Changes in building that keep you fresh for the future
[14:35] Evolution core challenges to real-time and use cases in real-time
[17:33] DevOps + DataOps + ModelOps = MLOps
[19:21] MLflow Support
[21:37] Notebooks to production debates
[25:42] Companies tackling Notebooks to production
[27:40] MLOoops stories
[31:03] Opinionated MLOps productionizing in a good way
[40:23] Xiangrui's MLOps Vision
[44:47] Lightning round
[48:45] Wrap up
Jul 30, 2022 • 49min

More than a Cache: Turning Redis into a Composable, ML Data Platform // Samuel Partee // Coffee Sessions #111

MLOps Coffee Sessions #111 with Samuel Partee, Principal Applied AI Engineer at Redis, More than a Cache: Turning Redis into a Composable, ML Data Platform, co-hosted by Mihail Eric. This episode is sponsored by Redis.

// Abstract
This episode is about pushing the Redis platform to be more than just the web-serving cache we have known it as up to now. It seems like a natural progression for the platform: we see how it is evolving into an AI-focused, AI-native serving platform that provides vector similarity, feature store, and similar functionality.

// Bio
A Principal Applied AI Engineer at Redis, Sam helps guide the development and direction of Redis as an online feature store and vector database. Sam's background is in high-performance computing, including ML-related topics such as distributed training, hyperparameter optimization, and scalable inference.

// Related Links
https://partee.io
Redis VSS demo: https://github.com/Spartee/redis-vector-search
Redis Stack: https://redis.io/docs/stack/
Github - https://github.com/Spartee
OSS org Sam co-founded at HPE/Cray - https://github.com/CrayLabs
This paper last year was some of the best research and collaboration Sam has been a part of. The paper is published here: https://www.sciencedirect.com/science/article/pii/S1877750322001065?via%3Dihub
Do you really need an extra database for vectors? https://databricks.com/dataaisummit/session/emerging-data-architectures-approaches-real-time-ai-using-redis
Blink: The Power of Thinking Without Thinking by Malcolm Gladwell, Barry Fox, Irina Henegar (Translator): https://www.goodreads.com/book/show/40102.Blink

Connect with Mihail on LinkedIn: https://www.linkedin.com/in/mihaileric/
Connect with Sam on LinkedIn: www.linkedin.com/in/sam-partee-b04a1710a

Timestamps:
[00:00] Introduction to Samuel Partee
[00:24] Takeaways
[02:46] Updates on the Community
[05:17] Start of Redis
[08:10] Vision for Vector Search
[11:05] Changing the narrative going from the "Cache" for all servers and web endpoints
[14:35] Clear value prop on demos
[20:17] Vector Database
[26:26] Features with benefits
[28:41] AWS Spend
[30:39] Vector Database upsell model and bureaucratic convenience
[32:08] Distributed training hyperparameter optimization and scalable inference
[35:03] Core infrastructural advancement
[36:55] Tools movement to help
[39:00] Using Machine Learning at scale in numerical simulations with SmartSim: An application to ocean climate modeling (published paper)
[42:52] Future applications of tech to get excited with
[44:20] Lightning round
[47:48] Wrap up
Jul 29, 2022 • 52min

Just Fetch the Data and then... // David Bayliss // Coffee Sessions #110

MLOps Coffee Sessions #110 with David Bayliss, Chief Data Scientist at LexisNexis Risk Solutions, Just Fetch the Data and then..., co-hosted by Vishnu Rachakonda.

// Abstract
Composing data to extract features can be a significant problem. Key factors are the data size, compliance restrictions, and real-time data. Ethics (and law) can drive extremely complex audit requirements. In the cloud, you can do anything - at a price.

// Bio
One of the creators of the world's first big data platform (HPCC), David has been tackling big data problems for two decades. A mathematician, compiler writer, and data sponge with more than five dozen patents spanning platforms, linking, and search. Most inventors think outside the box; David can't even remember where the box is. He leads the team that creates the core data science methods used by hundreds of data scientists.

// Related Links
Interesting insight in this post. Would be cool to learn from David about his view on things: https://www.google.com/url?q=https://www.linkedin.com/posts/david-bayliss-426556a_datascience-platform-portability-activity-6913448643303759872-2dqq?utm_source%3Dlinkedin_share%26utm_medium%3Dmember_desktop_web&sa=D&source=calendar&ust=1649078059106132&usg=AOvVaw26wAevExeEfW_AdZSA8UhF

Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with David on LinkedIn: https://www.linkedin.com/in/david-bayliss-426556a/

Timestamps:
[00:00] Introduction to David Bayliss
[01:03] Takeaways
[04:56] LexisNexis and David's role
[07:15] Evolution of LexisNexis in 20 years with so many use cases
[08:51] Role of David in structuring data for working with data change
[14:32] Data management and data access
[17:45] Unique challenges of scale, use case, and diversity at LexisNexis
[24:47] Tardis Iron Box
[30:05] Iron Box translation
[32:56] JVM for data science
[34:24] Iron Box meaning
[36:52] Metadata with PII
[39:08] Detrimental privacy / Hairy Kneecap Theory
[40:57] Speeding things up and Anonymized linking
[46:47] What kept David working at LexisNexis?
[50:30] Wrap up
Jul 23, 2022 • 1h 12min

Why You Need More Than Airflow // Ketan Umare // Coffee Sessions #109

MLOps Coffee Sessions #109 with Ketan Umare, Co-founder and CEO of Union.ai, Why You Need More Than Airflow, co-hosted by George Pearse.

// Abstract
Airflow is a tool beloved by data engineers and machine learning engineers alike. But when doing ML, what are its shortcomings, and why is an orchestration tool like that not always the best developer experience? In this episode, we break down some of the key drivers for using an ML-specific orchestration tool.

// Bio
Ketan Umare is the CEO and co-founder of Union.ai. Previously he held multiple senior roles at Lyft, Oracle, and Amazon, spanning cloud, distributed storage, mapping (map-making), and machine learning systems. He is passionate about building software that makes engineers' lives easier and provides simplified access to large-scale systems. Besides software, he is a proud father and husband, and enjoys traveling and outdoor activities.

// Related Links
Zero to One: Notes on Startups, or How to Build the Future by Peter Thiel and Blake Masters: https://www.amazon.com/Zero-One-Notes-Startups-Future/dp/0804139296

Connect with George on LinkedIn: https://www.linkedin.com/in/george-pearse-b7a76a157/?originalSubdomain=uk
Connect with Ketan on LinkedIn: https://www.linkedin.com/in/ketanumare/
Jul 19, 2022 • 1h 6min

ML Flow vs Kubeflow 2022 // Byron Allen // Coffee Sessions #108

MLOps Coffee Sessions #108 with Byron Allen, AI & ML Practice Lead at Contino, ML Flow vs Kubeflow 2022, co-hosted by George Pearse.

// Abstract
The amazing Byron Allen talks to us about why MLflow and Kubeflow are not playing the same game! MLflow vs Kubeflow is more like comparing apples to oranges or, as he likes to put it, they are both cheese, but one is an all-rounder and the other a high-class delicacy. This can be quite deceiving when analyzing the two. We do a deep dive into the functionality of both and the pros and cons they have to offer.

// Bio
Byron wears several hats: AI & ML practice lead, solutions architect, ML engineer, data engineer, data scientist, Google Cloud Authorized Trainer, and scrum master. He has a track record of successfully advising on and delivering data science platforms and projects. Byron has a mix of technical capability, business acumen, and communication skills that make him an effective leader, team player, and technology advocate. Read Byron's writing at https://medium.com/@byron.allen

Connect with George on LinkedIn: https://www.linkedin.com/in/george-pearse-b7a76a157/?originalSubdomain=uk
Connect with Byron on LinkedIn: https://www.linkedin.com/in/byronaallen/

Timestamps:
[00:00] Introduction to Byron Allen
[01:10] Introduction to the new co-host George Pearse
[01:41] ML Flow vs Kubeflow
[05:40] George's take on ML Flow and Kubeflow
[07:28] Writing in YAML
[09:47] Developer experience
[13:38] Changes in ML Flow and Kubeflow
[17:58] Messing around with ML Flow Serving
[20:00] A taste of Kubeflow through KServe
[23:18] Managed service of Kubeflow
[25:15] How George used Kubeflow
[27:45] Getting the Managed Service
[31:30] Getting Authentication
[32:41] ML Flow docs vs Kubeflow docs
[36:59] Kubeflow community incentives
[42:25] MLOps Search term
[42:52] Organizational problem
[43:50] Final thoughts on ML Flow and Kubeflow
[49:19] Bonus
[49:35] Entity-Centric Modeling
[52:11] Semantic Layer options
[57:27] Semantic Layer with Machine Learning
[58:40] Satellite Infra Images demo
[1:00:49] Motivation to move away from SQL
[1:03:00] Managing SQL
[1:05:24] Wrap up
Jul 11, 2022 • 59min

Why and When to Use Kubeflow for MLOps // Ryan Russon // Coffee Sessions #107

MLOps Coffee Sessions #107 with Ryan Russon, Manager, MLOps and Data Science at Maven Wave Partners, Why and When to Use Kubeflow for MLOps, co-hosted by Mihail Eric.

// Abstract
Kubeflow is an excellent platform if your team is already leveraging Kubernetes, and it allows for a truly collaborative experience. Let’s take a deep dive into the pros and cons of using Kubeflow in your MLOps.

// Bio
From serving as an officer in the US Navy to consulting for some of America's largest corporations, Ryan has found his passion in the enablement of Data Science workloads for companies and teams. Having spent years as a data scientist, Ryan understands the types of challenges that DS teams face in scaling, tracking, and efficiently running their workloads.

// Related Links
https://www.mavenwave.com/
https://go.mlops.community/hFApDb

Connect with Mihail on LinkedIn: https://www.linkedin.com/in/mihaileric/
Connect with Ryan on LinkedIn: https://www.linkedin.com/in/ryanrusson/

Timestamps:
[00:00] Introduction to Ryan Russon
[01:13] Takeaways
[04:17] Bullish on Kubeflow!
[06:23] Kubeflow in ML tooling
[11:47] Kubeflow having its velocity
[14:16] To Kubeflow or not to Kubeflow
[18:25] Kubeflow ecosystem maturity
[20:51] Alternatively starting from scratch?
[23:11] Argo workflow vs Kubeflow pipelines
[25:08] Kubeflow as an end-state for citizen data scientists
[28:24] End-to-end workflow key players
[31:17] KServe
[33:41] Kubeflow on orchestrators
[36:24] Natural transition to Kubeflow maturity
[41:33] "Don't forget about the engineer cost."
[42:21] Kubeflow to other "Flow brothers" trade-offs
[46:12] Biggest MLOps challenge
[49:52] Best practices around file structure
[52:15] Kubeflow changes over the years and what to expect moving forward
[55:52] Best-of-breed vision
[57:54] Wrap up
Jul 5, 2022 • 54min

Building a Culture of Experimentation to Speed Up Data-Driven Value // Delina Ivanova // MLOps Coffee Sessions #106

MLOps Coffee Sessions #106 with Delina Ivanova, Associate Director, Data at HelloFresh, Building a Culture of Experimentation to Speed Up Data-Driven Value, co-hosted by Vishnu Rachakonda.

// Abstract
Supply chain and manufacturing are prime areas where the use of data science, analytics, and ML is underdeveloped, and experimentation is required to collect data and enable data-driven solutions. This talk encourages companies to conduct experiments and collect data over time in order to build accurate, scalable data-driven solutions.

// Bio
Delina has over 10 years of experience across data and analytics, consulting, and strategy with roles spanning financial services, public sector, and CPG industries. She is currently the Associate Director, Data & Insights at HelloFresh Canada where she leads a full-service data team, including data engineering, data science, and business intelligence and automation. She is also a Data Science and Machine Learning instructor in the professional development programs at the University of Toronto and the University of Waterloo.

// Related Links
The Discourses of Epictetus book: https://www.amazon.com/Discourses-Epictetus/dp/1537427180
The Pyramid Principle: Logic in Writing and Thinking book by Barbara Minto: https://www.amazon.com/Pyramid-Principle-Logic-Writing-Thinking/dp/0273710516

Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with Delina on LinkedIn: https://www.linkedin.com/in/delina-ivanova/

Timestamps:
[00:00] Introduction to Delina Ivanova
[00:35] Takeaways
[03:46] Looking for People to organize local Meetups!
[04:30] Delina's career trajectories and growth to the corporate schema
[10:02] Telling stories with data
[13:23] Tricks for being a translator from the business side to data teams
[15:32] Technical engineering management and Delina's day-to-day role
[20:40] Giving up day-to-day individual contributing work and coding
[23:33] Good leadership for technical work
[31:05] Growing team growing productivity
[32:55] Pressured to grow
[35:23] HelloFresh
[39:39] Challenges of e-commerce, CPG, Logistics, and grocery combined
[41:08] Cultural differences
[46:04] Rapid fire session
[52:20] Wrap up
Jul 1, 2022 • 1h 6min

Cleanlab: Labeled Datasets that Correct Themselves Automatically // Curtis Northcutt // MLOps Coffee Sessions #105

In this episode, Curtis Northcutt, CEO & Co-Founder of Cleanlab, discusses the importance of data-centric AI and the challenges of addressing noisy data. They also delve into the journey of Cleanlab in improving data labeling accuracy, the success of the startup in finding and correcting bad data, and the frustrations of bug smashing. Additionally, they explore the challenges of understanding the value and capabilities of AI tools and companies, as well as the hiring opportunities in DevRel and front-end engineering.
Jun 24, 2022 • 52min

MLOps + BI? // Maxime Beauchemin // MLOps Coffee Sessions #104

MLOps Coffee Sessions #104 with Maxime Beauchemin, creator of Apache Airflow and Apache Superset, Future of BI, co-hosted by Vishnu Rachakonda.

// Bio
Maxime Beauchemin is the founder and CEO of Preset and the original creator of Apache Superset. Max has worked at the leading edge of data and analytics his entire career, helping shape the discipline in influential roles at data-dependent companies like Yahoo!, Lyft, Airbnb, Facebook, and Ubisoft.

// Related Links
Website: https://www.rungalileo.io/
Trade-Off: Why Some Things Catch On, and Others book by Kevin Maney: https://www.amazon.com/Trade-Off-Some-Things-Catch-Others/dp/0385525958

Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with Max on LinkedIn: https://www.linkedin.com/in/maximebeauchemin/

Timestamps:
[00:00] Introduction to Maxime Beauchemin
[01:28] Takeaways
[03:42] Paradigm of data warehouse
[06:38] Entity-centric data modeling
[11:33] Metadata for metadata
[14:24] Problem of data organization for a rapidly scaling organization
[18:36] Machine Learning tooling as a subset or of its own
[22:28] Airflow: The unsung hero of the data scientists
[27:15] Analyzing Airflow
[30:44] Disrupting the field
[34:45] Solutions to the ladder problem of empowering exploratory work and mortals superpowers with data
[38:04] What to watch out for when building for data scientists
[41:47] Rapid fire questions
[51:12] Wrap up
