

MLOps.community
Demetrios
Relaxed Conversations around getting AI into production, whatever shape that may come in (agentic, traditional ML, LLMs, Vibes, etc)

Oct 26, 2020 • 57min
Machine Learning in Production = Data Engineering + ML + Software Engineering // Satish Chandra Gupta // MLOps Coffee Sessions #16
// Bio:
Satish built compilers, profilers, IDEs, and other dev tools for over a decade. At Microsoft Research, he saw his colleagues solving hard program analysis problems using Machine Learning. That is when he got curious and started learning. His approach to ML is influenced by his software engineering background of building things for production. He has a keen interest in doing ML in production, which is a lot more than training and tuning the models: the first step is understanding the product and business context, then building an efficient pipeline, then training models, and finally monitoring their efficacy and impact on the business. He considers ML another tool in the software engineering toolbox, albeit a very powerful one. He is a co-founder of Slang Labs, a Voice Assistant as a Service platform for building in-app voice assistants.
// Talk Takeaways:
- ML-driven product features will grow manifold.
- Organizations take an evolutionary approach to absorbing tech innovations, and ML will be no exception; how organizations adopted the cloud can offer useful lessons.
- ML/DS folks who invest in understanding the business context and tech environment of their org will make a bigger impact.
- Organizations that invest in data infrastructure will be more successful in extracting value from machine learning.
// Other links where you can find Satish:
An Engineer’s Trek into Machine Learning:
https://scgupta.link/ml-intro-for-developers or
https://towardsdatascience.com/software-engineers-trek-into-machine-learning-46b45895d9e0
Architecture for High-Throughput Low-Latency Big Data Pipeline on Cloud:
https://scgupta.link/big-data-pipeline-architecture or
https://towardsdatascience.com/scalable-efficient-big-data-analytics-machine-learning-pipeline-architecture-on-cloud-4d59efc092b5
Linkedin:
https://www.linkedin.com/in/scgupta
Twitter:
https://twitter.com/scgupta
Personal Website:
http://scgupta.me
Company Website:
https://slanglabs.in
Voice Assistants info:
https://www.slanglabs.in/voice-assistants
Timestamps:
0:00 - Intro to Satish Chandra Gupta
1:05 - Satish's background in machine learning
3:29 - What Satish is doing now
5:34 - Why were you interested in the challenges of the workload?
9:53 - As you're looking at the data pipeline, do you see much overlap there?
15:38 - How engineering pipeline characteristics relate to the data
20:24 - Tips for saving money when building these pipelines
24:44 - First point of engagement: Collection
31:26 - Possibilities of Data Architecture
38:03 - Why is it beneficial to save money?
44:22 - Satish's learnings from his current project, Voice Assistant as a Service

Oct 20, 2020 • 1h 2min
MLOps + Machine Learning // James Sutton // MLOps Coffee Sessions #15
James Sutton is an ML Engineer focused on helping enterprises bridge the gap between what they have now and where they need to be to enable production-scale ML deployments.
----------- Connect With Us ✌️-------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with James on LinkedIn: https://www.linkedin.com/in/jamessutton2/
Timestamps:
0:00 - Intro to Speaker
2:20 - Scope of the coffee session
3:10 - Background of James Sutton
8:28 - One-shot classifier algorithms
12:46 - Why is deployment a challenge from the engineering perspective?
19:20 - How to overcome bottlenecks?
30:07 - What is your vision of the landscape?
34:45 - How maturity plays out
38:48 - Maturity perspective of ML
41:49 - Risk of overgeneralizing system designs patterns
46:10 - Reliability, Speed, Cost
46:46 - Consistency, Availability, Partition Tolerance (CAP Theorem)
47:36 - How do you go about discussing these tradeoffs with your clients?
51:23 - How would you deal with PII?
58:50 - Collaborative process with clients
1:00:55 - Wrap up

Oct 19, 2020 • 57min
Scalable Python for Everyone, Everywhere // Matthew Rocklin // MLOps Meetup #38
Parallel Computing with Dask and Coiled
Python makes data science and machine learning accessible to millions of people around the world. However, historically Python hasn't handled parallel computing well, which leads to issues as researchers try to tackle problems on increasingly large datasets. Dask is an open-source Python library that extends the existing Python data science stack (NumPy, Pandas, Scikit-Learn, Jupyter, ...) with parallel and distributed computing. Today Dask has been broadly adopted by most major Python libraries and is maintained by a robust open-source community across the world.
This talk discusses parallel computing generally, Dask's approach to parallelizing an existing ecosystem of software, and some of the challenges we've seen in deploying distributed systems.
Finally, we also address the challenges of robustly deploying distributed systems, which ends up being one of the main accessibility hurdles for users today. We hope that by the end of the meetup attendees will better understand parallel computing, have built intuition around how Dask works, and have had the opportunity to play with their own Dask cluster on the cloud.
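For a feel of what this looks like in practice, here is a minimal sketch (our own illustrative example, not from the talk) of Dask parallelizing a pandas-style workload; the file pattern and column names are hypothetical:

import dask.dataframe as dd
from dask.distributed import Client

client = Client()  # starts a local cluster, one worker per core; a cloud cluster works the same way

# hypothetical CSV shards, together too large for a single pandas DataFrame
df = dd.read_csv("events-2020-*.csv")

# the familiar pandas API, but lazy: this builds a task graph instead of computing
mean_duration = df.groupby("user_id")["duration"].mean()

# .compute() executes the graph in parallel across the workers
print(mean_duration.compute())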
Matthew is an open source software developer in the numeric Python ecosystem. He maintains several PyData libraries, but today focuses mostly on Dask, a library for scalable computing. Matthew worked at Anaconda Inc for several years, then built out the Dask team at NVIDIA for RAPIDS, and most recently founded Coiled Computing to improve Python's scalability with Dask for large organizations.
Matthew has given talks at a variety of technical, academic, and industry conferences. A list of talks and keynotes is available at (https://matthewrocklin.com/talks).
Matthew holds a bachelor’s degree from UC Berkeley in physics and mathematics, and a PhD in computer science from the University of Chicago.
Check out our posts here to get more context around where we're coming from:
https://medium.com/coiled-hq/coiled-dask-for-everyone-everywhere-376f5de0eff4
https://medium.com/coiled-hq/the-unbearable-challenges-of-data-science-at-scale-83d294fa67f8
----------- Connect With Us ✌️-------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Matthew on LinkedIn: https://www.linkedin.com/in/matthew-rocklin-461b4323/

Oct 18, 2020 • 1h 1min
MLOps Coffee Sessions #13 How to Choose the Right Machine Learning Tool: A Conversation // Jose Navarro and Mariya Davydova
This time we talked about one of the liveliest questions for any MLOps practitioner: how to choose the right tools for your ML team, given the huge number of open-source and proprietary MLOps tools available on the market today.
We discussed several criteria to rely on when choosing a tool, including:
- The requirements of the team's particular use cases
- The scaling capacity of the tool
- The cost of migration from a chosen tool
- The cost of teaching the team to use this tool
- The company or the community behind the tool
Apart from that, we talked about particular use cases and discussed the trade-offs between waiting for a new release of your tool to get the missing piece of functionality, switching to another tool, and building an in-house solution.
We also touched on the topic of organising MLOps teams and practices across large companies with many ML teams.
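One lightweight way to apply criteria like those above is a weighted scorecard. A toy sketch in Python (the weights, tools, and scores are all invented for illustration, not from the conversation):

# criteria weights reflect the team's priorities
criteria_weights = {
    "fits_use_cases": 0.35,
    "scaling": 0.20,
    "migration_cost": 0.15,  # higher score = cheaper to migrate away from later
    "learning_cost": 0.15,   # higher score = easier to teach the team
    "backing": 0.15,         # company or community behind the tool
}

# hypothetical candidate tools scored 1-5 per criterion
tools = {
    "tool_a": {"fits_use_cases": 4, "scaling": 5, "migration_cost": 2, "learning_cost": 3, "backing": 5},
    "tool_b": {"fits_use_cases": 5, "scaling": 3, "migration_cost": 4, "learning_cost": 4, "backing": 3},
}

for name, scores in tools.items():
    total = sum(criteria_weights[c] * s for c, s in scores.items())
    print(f"{name}: {total:.2f}")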
// Bio:
Jose Navarro
Jose Navarro is a Machine Learning Infrastructure Engineer making everyday cooking fun at Cookpad, whose recipe platform has more than 40 million monthly users. He holds an MSc in Machine Learning and High-Performance Computing from the University of Bristol. He is interested in Cloud Native technologies, serverless, and event-driven architecture.
Mariya Davydova
Mariya came to MLOps from a software development background. She started her career as a Java developer at JetBrains in 2011, then gradually moved to developer advocacy for JS-based APIs. In 2019, she joined Neu.ro as a platform developer advocate and then moved into product management.
Mariya has been obsessed with AI and ML for many years: she has finished a bunch of courses, read a lot of books, and even written a couple of fiction stories about AI. She believes that proper tooling and decent development and operations practices are essential success components for ML projects, just as they are for traditional software development.
----------- Connect With Us ✌️-------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Jose on LinkedIn: https://www.linkedin.com/in/jose-navarro-2a57b612/
Connect with Mariya on LinkedIn: https://www.linkedin.com/in/mariya-davydova/

Oct 12, 2020 • 57min
MLOps Coffee Sessions #14 Conversation with the Creators of Dask // Hugo Bowne-Anderson and Matthew Rocklin
Hugo Bowne-Anderson and Matthew Rocklin, co-founders of Coiled, are reshaping the data science landscape. They dive into Dask, the open-source library that optimizes parallel computing for Python, making it easier to handle large datasets. The duo discusses the challenges of scaling data science, navigating cloud complexities, and the vital role of data literacy in organizations. They also share insights on community engagement in open source, the evolution of OSS, and the advantages of Dask over tools like Spark, emphasizing its future in distributed computing.

Oct 10, 2020 • 1h 5min
MLOps Coffee Sessions #12: Journey of Flyte at Lyft and Through Open-source // Ketan Umare
Ketan Umare, a Senior Staff Software Engineer at Lyft, discusses his pivotal role in the development of Flyte, an open-source project for machine learning infrastructure. He explains why Flyte was created, highlighting its capacity to handle tens of thousands of workflows and millions of tasks. The conversation delves into the complexities of mapping technology and the algorithmic challenges in ride-sharing. Ketan also shares insights on open-source community engagement and the transition to using Go for backend development.

Oct 4, 2020 • 1h 6min
MLOps Coffee Sessions #11: Analyzing “Continuous Delivery and Automation Pipelines in ML" // Part 3
Round 3 of analyzing the Google paper "Continuous Delivery and Automation Pipelines in ML"
// Show Notes
Data Science Steps for ML
Data extraction: You select and integrate the relevant data from various data sources for the ML task.
Data analysis: You perform exploratory data analysis (EDA) to understand the available data for building the ML model. This process leads to the following:
Understanding the data schema and characteristics that are expected by the model.
Identifying the data preparation and feature engineering that are needed for the model.
Data preparation: The data is prepared for the ML task. This preparation involves data cleaning, where you split the data into training, validation, and test sets. You also apply data transformations and feature engineering for the model that solves the target task. The output of this step is the data splits in the prepared format.
Model training: You implement different algorithms with the prepared data to train various ML models. In addition, you subject the implemented algorithms to hyperparameter tuning to get the best-performing ML model. The output of this step is a trained model.
Model evaluation: The model is evaluated on a holdout test set to assess its quality. The output of this step is a set of metrics describing the model's quality.
Model validation: The model is confirmed to be adequate for deployment—that its predictive performance is better than a certain baseline.
Model serving: The validated model is deployed to a target environment to serve predictions. This deployment can be one of the following:
Microservices with a REST API to serve online predictions.
An embedded model to an edge or mobile device.
Part of a batch prediction system.
Model monitoring: The model predictive performance is monitored to potentially invoke a new iteration in the ML process.
The level of automation of these steps defines the maturity of the ML process, which reflects the velocity of training new models given new data or training new models given new implementations. The following sections describe three levels of MLOps, starting from the most common level, which involves no automation, up to automating both ML and CI/CD pipelines.
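To make the fully manual (level 0) version of these steps concrete, here is a minimal sketch using scikit-learn on a synthetic dataset; the baseline value and all names are illustrative assumptions, not from the paper:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# data extraction: stand-in for pulling data from real sources
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# data preparation: split into training and holdout test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# model training (hyperparameter tuning omitted for brevity)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# model evaluation: metrics on the holdout set
accuracy = accuracy_score(y_test, model.predict(X_test))

# model validation: only promote to serving if it beats an assumed baseline
BASELINE = 0.70
if accuracy > BASELINE:
    print(f"accuracy {accuracy:.3f} beats baseline, promote model to serving")
else:
    print(f"accuracy {accuracy:.3f} below baseline, do not deploy")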
In the rest of the conversation, we talk about maturity levels 0 and 1. Next session we will talk about Level 2.
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/

Oct 4, 2020 • 56min
MLOps Meetup #36: Moving Deep Learning from Research to Prod Using DeterminedAI and Kubeflow // David Hershey, DeterminedAI
MLOps community meetup #36! This week we talk to David Hershey, Solutions Engineer at Determined AI, about Moving Deep Learning from Research to Production with Determined and Kubeflow.
// Key takeaways:
What components are needed to do inference in ML
How to structure models for ML inference
How a model registry helps organize your models for easy consumption
How you can set up reusable and easy-to-upgrade inference pipelines
// Abstract:
Translating the research that goes into creating a great deep learning model into a production application is a mess without the right tools. ML models have a lot of moving pieces, and on top of that models are constantly evolving as new data arrives or the model is tweaked. In this talk, we'll show how you can find order in that chaos by using the Determined Model Registry along with Kubeflow Pipelines.
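To make the registry idea concrete before the talk gets into Determined's version, here is a toy, framework-agnostic sketch of what a model registry does (this is not Determined's actual API): named models, append-only versions, and lookup of the latest version for serving.

import time

class ModelRegistry:
    """Toy in-memory registry: model name -> ordered list of version records."""
    def __init__(self):
        self._models = {}

    def register_version(self, name, artifact_uri, metrics):
        versions = self._models.setdefault(name, [])
        versions.append({
            "version": len(versions) + 1,
            "artifact_uri": artifact_uri,  # e.g. a checkpoint path in object storage
            "metrics": metrics,
            "registered_at": time.time(),
        })
        return versions[-1]["version"]

    def latest(self, name):
        return self._models[name][-1]

registry = ModelRegistry()
registry.register_version("churn-model", "s3://checkpoints/abc123", {"auc": 0.91})
print(registry.latest("churn-model"))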
// Bio:
David Hershey is a solutions engineer for Determined AI. David has a passion for machine learning infrastructure, in particular systems that enable data scientists to spend more time innovating and changing the world with ML. Previously, David worked at Ford Motor Company as an ML Engineer where he led the development of Ford's ML platform. He received his MS in Computer Science from Stanford University, where he focused on Artificial Intelligence and Machine Learning.
// Relevant Links
www.determined.ai
https://github.com/determined-ai/determined
https://determined.ai/blog/production-training-pipelines-with-determined-and-kubeflow/
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David Hershey on LinkedIn: https://www.linkedin.com/in/david-hershey-458ab081/
Timestamps:
0:00 - Intros
4:15 - The structure of the chat
5:20 - What is DeterminedAI?
7:20 - How is DeterminedAI different from other more standard artifact storage solutions?
9:25 - Where are the boundaries between what DeterminedAI does really well and where it hands off smoothly to other tools around it?
11:48 - Is Kubeflow dying?
13:54 - How do you see DeterminedAI and Kubeflow becoming more solidified?
15:55 - How does DeterminedAI interact with Kubeflow at the moment?
18:01 - What types of models are they, and is that in the Kubeflow metadata?
19:18 - What is a model registry, and why is it so important to have one?
23:16 - Can you give us the quick demo real fast?
30:52 - Which orchestration tool to use?
32:04 - When using Kubeflow or Determined, how can you deploy the model through CD tools like Jenkins?
33:40 - How is Determined connected to Kubeflow?
36:09 - What components are needed to do inference in machine learning, and how can we structure models for ML inference?
40:04 - Are they the same when we talk about ML researchers?
42:14 - How can we be better prepared for when we do want to get into production?
44:59 - In this pipeline, where do you normally see people getting stopped?
47:05 - What things have you seen pop up that you weren't necessarily thinking about in those first phases?
50:17 - What is the most underrated topic regarding deploying machine learning models in production?
52:44 - How do you see the adoption of tools such as Determined and Kubeflow by Data scientists?
54:40 - Can you explain the Determined open source components?

Sep 22, 2020 • 1h 8min
MLOps Coffee Sessions #10 Analyzing the Article “Continuous Delivery and Automation Pipelines in Machine Learning" // Part 2
Second installment: David and Demetrios review the Google paper about continuous training and automated pipelines. They dive deep into machine learning monitoring and what exactly continuous training entails. Some key highlights:
Automatically retraining and serving the models:
- When to do it?
- Outlier detection
- Drift detection
Outlier detection:
- What is it?
- How do you deal with it?
Drift detection:
Individual features may start to drift. This could be a bug or it could be perfectly normal behavior that indicates that the world has changed requiring the model to be retrained.
Example changes:
- shifts in people’s preferences
- marketing campaigns
- competitor moves
- the weather
- the news cycle
- locations
- time
- devices (clients)
If the world you're working with is changing over time, model deployment should be treated as a continuous process. What this tells me is that you should keep the data scientists and engineers working on the model instead of immediately moving to another project.
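As a deliberately simple illustration, per-feature drift can be checked with a two-sample Kolmogorov-Smirnov test. This sketch uses synthetic data and an assumed significance threshold; it is one common approach, not the paper's prescription:

import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(0.0, 1.0, 10_000)  # feature distribution at training time
live_feature = rng.normal(0.3, 1.0, 10_000)   # recent production traffic, slightly shifted

stat, p_value = ks_2samp(train_feature, live_feature)
if p_value < 0.01:  # assumed threshold; tune per feature and traffic volume
    print(f"drift detected (KS statistic {stat:.3f}), consider retraining")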
Deeper dive into concept drift
Feature/target distributions change
An overview of concept drift applications: “…data analysis applications, data evolve over time and must be analyzed in near real time. Patterns and relations in such data often evolve over time; thus, models built for analyzing such data quickly become obsolete over time. In machine learning and data mining this phenomenon is referred to as concept drift.”
https://www.win.tue.nl/~mpechen/publications/pubs/CD_applications15.pdf
https://www-ai.cs.tu-dortmund.de/LEHRE/FACHPROJEKT/SS12/paper/concept-drift/tsymbal2004.pdf
Types of concept drift:
- Sudden
- Gradual
Google in some way is trying to address this concern - the world is changing and you want your ML system to change as well so it can avoid decreased performance but also improve over time and adapt to its environment. This sort of robustness is necessary for certain domains.
Continuous delivery and automation of pipelines (data, training, prediction service) were built with this in mind: minimize the commit-to-deploy interval and maximize the velocity of software delivery, along with maintainability, extensibility, and testability.
Once the pipeline is ready, you can run it, and you can do so continuously. After the pipeline is deployed to the production environment, it is executed automatically and repetitively to produce a trained model that is stored in a central model registry.
This pipeline should be able to run on a schedule or based on triggers: events configured for your business domain, such as new data arriving or a drop in performance of the prod model.
The link between the model artifact and the pipeline is never severed. What pipeline trained it? What data was extracted and validated, and how was it prepared? What was the training configuration, and how was the model evaluated? Metrics are key here: lineage tracking!
Keeping a close tie between the dev/experiment pipeline and the continuous production pipeline helps avoid inconsistencies between model artifacts produced by the pipeline and the models being served, which are otherwise hard to debug.
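Lineage tracking in miniature: a hypothetical record (all paths and IDs invented) tying a model artifact back to the pipeline run, data, and config that produced it. Real pipelines use a metadata store for this; the shape of the record is the point:

import hashlib
import json
import time

lineage = {
    "model_artifact": "s3://models/churn/v42.pkl",
    "pipeline_run_id": "run-2020-09-22-0113",
    "training_data": {
        "uri": "s3://warehouse/churn/2020-09-21/",
        "checksum": hashlib.sha256(b"<data manifest bytes>").hexdigest(),
    },
    "config": {"algorithm": "gradient_boosting", "max_depth": 6},
    "metrics": {"auc": 0.91},
    "created_at": time.time(),
}

with open(f"lineage-{lineage['pipeline_run_id']}.json", "w") as f:
    json.dump(lineage, f, indent=2)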
Join our slack community: https://join.slack.com/t/mlops-community/shared_invite/zt-391hcpnl-aSwNf_X5RyYSh40MiRe9Lw
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Chris Sterry on LinkedIn: https://www.linkedin.com/in/chrissterry/

Sep 17, 2020 • 53min
MLOps Meetup #34: Streaming Machine Learning with Apache Kafka and Tiered Storage // Kai Waehner, Confluent
MLOps Meetup #34! This week we talk to Kai Waehner about the beast that is Apache Kafka and the many different ways you can use it!
// Key takeaways:
- Kafka is much more than just messaging
- Kafka is the de facto standard for processing huge volumes of data at scale in real-time
- Kafka and Machine Learning are complementary for various use cases (including data integration, data processing, model training, model scoring, and monitoring)
// Abstract:
The combination of Apache Kafka, tiered storage, and machine learning frameworks such as TensorFlow enables you to build one scalable, reliable, but also simple infrastructure for all machine learning tasks using the Apache Kafka ecosystem and Confluent Platform. This discussion features a predictive maintenance use case within a connected car infrastructure, but the discussed components and architecture are helpful in any industry.
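As a rough sketch of the "embedded model" pattern the talk touches on, here is a consumer that scores each event and emits alerts. It uses the kafka-python client, and the topics, fields, and stand-in model are invented for illustration:

import json
from kafka import KafkaConsumer, KafkaProducer  # pip install kafka-python

consumer = KafkaConsumer(
    "car-sensor-events",  # hypothetical input topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def predict_failure(features):
    # stand-in for a real embedded model, e.g. a loaded TensorFlow SavedModel
    return sum(features) / len(features) > 0.5

for event in consumer:
    payload = event.value
    if predict_failure(payload["features"]):
        producer.send("maintenance-alerts", {"car_id": payload["car_id"], "alert": True})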
// Bio:
Kai Waehner is a Technology Evangelist at Confluent. He works with customers across the globe and with internal teams like engineering and marketing. Kai’s main area of expertise lies within the fields of Big Data Analytics, Machine Learning, Hybrid Cloud Architectures, Event Stream Processing and Internet of Things. He is a regular speaker at international conferences such as Devoxx, ApacheCon and Kafka Summit, writes articles for professional journals, and shares his experiences with new technologies on his blog: www.kai-waehner.de.
Join our slack community: https://join.slack.com/t/mlops-community/shared_invite/zt-391hcpnl-aSwNf_X5RyYSh40MiRe9Lw
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Kai: contact@kai-waehner.de / @KaiWaehner / LinkedIn (https://www.linkedin.com/in/megachucky/)
________Show Notes_______
Blog post on tiered storage
https://www.confluent.io/blog/streaming-machine-learning-with-tiered-storage/
https://www.confluent.io/resources/kafka-summit-2020/apache-kafka-tiered-storage-and-tensorflow-for-streaming-machine-learning-without-a-data-lake/
Blog post about using Kafka as a database
https://www.kai-waehner.de/blog/2020/03/12/can-apache-kafka-replace-database-acid-storage-transactions-sql-nosql-data-lake/
Example repo on GitHub
https://github.com/kaiwaehner/hivemq-mqtt-tensorflow-kafka-realtime-iot-machine-learning-training-inference
Model serving vs. embedded Kafka
https://www.confluent.io/blog/machine-learning-real-time-analytics-models-in-kafka-applications/
https://www.confluent.io/kafka-summit-san-francisco-2019/event-driven-model-serving-stream-processing-vs-rpc-with-kafka-and-tensorflow/
Istio blog post
https://www.kai-waehner.de/blog/2019/09/24/cloud-native-apache-kafka-kubernetes-envoy-istio-linkerd-service-mesh/