MLOps.community

Demetrios

Relaxed Conversations around getting AI into production, whatever shape that may come in (agentic, traditional ML, LLMs, Vibes, etc)

Episodes

Nov 19, 2020 • 52min

When Machine Learning meets privacy - Episode 3 with Charles Radclyffe

**AI and ethical dilemmas** Artificial Intelligence is seen by many as a vehicle for great transformation, but for others it remains a mystery, and many questions are still unanswered: Will AI systems rule us one day? Can we trust AI to rule our criminal justice systems? Could it create political campaigns and dominate political advertising? Or something less harmful, like doing our laundry? Some of these questions may sound absurd, but they are certainly pushing people to look beyond purely functional AI capabilities to the ethics behind creating such powerful solutions. Our guest for this episode is Charles Radclyffe, the data philosopher, who covers some of these dilemmas. You can reach out to Charles through LinkedIn or at ethicsgrade.io

Useful links:
- MLOps.Community slack
- TEDx talk - Surviving the Robot Revolution
- Digital Ethics whitepaper

Nov 16, 2020 • 59min

UN Global Platform // Mark Craddock // Co-Founder & CTO, Global Certification and Training Ltd // MLOps Meetup #42

MLOps community meetup #42! Last Wednesday, we talked to Mark Craddock, Co-Founder & CTO, Global Certification and Training Ltd (GCATI), about UN Global Platform.

// Abstract: Building a global big data platform for the UN. Streaming 600,000,000+ records / day into the platform. The strategy was developed using Wardley Maps and the Platform Design Toolkit.

// Bio: Mark contributed to the Cloud First policy for the UK Public sector and was one of the founding architects for the UK Government's G-Cloud programme. Mark developed the initial CloudStore, which enabled the UK Public Sector to procure cloud services from over 2,500 suppliers. The UK Public Sector has now purchased over £6.3Bn of cloud services, with £3.6Bn from Small to Medium Enterprises in the UK. Mark led the development of the United Nations Global Platform, a multi-cloud platform for capacity building within the national statistics offices in the use of Big Data and its integration with administrative sources, geospatial information, traditional survey and census data. Mark is now building a non-profit training and certification organization.

----------- Connect With Us ✌️-------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Mark on LinkedIn: https://www.linkedin.com/in/markcraddock/

Timestamps:
[0:00] - Intro to Mark Craddock
[03:35] - Mark's background
[05:05] - UN Global Platform
[05:18] - Vision: A global collaboration to harness the power of data for better lives
[05:37] - UN GWG (Big) Data Membership
[05:49] - Sustainable Development Goals
[06:21] - Using the platform
[06:30] - Approach
[06:44] - Principles
[07:29] - How big was the team who put this together?
[08:09] - Leave no one behind. Endeavour to reach the furthest behind first.
[08:24] - Platform Business Model
[10:06] - Six distinct aspects of a platform and its ecosystem
[10:46] - The platform is the only business model able to orchestrate the wide range of products and services in an ecosystem
[11:09] - Through the means of a platform organization, ecosystems are capable of providing an improbable combination of attributes
[11:55] - Platforms and business models are also one of the best organizational structures for enabling rapid evolution
[13:22] - Technology Strategy
[13:23] - Wardley Maps
[14:50] - Is this where Machine Learning tools would fit in?
[20:35] - Are you looking at how fast these are moving across to the right? How can you gauge that?
[26:57] - Is the value fluid?
[28:43] - How did you factor in the different personas?
[30:34] - How do you enable loosely coupled teams?
[35:44] - Data also moves from left to right
[42:00] - Technology Strategy Handbook
[42:20] - Achievements - July '19
[42:31] - Global Billing Intelligence
[43:15] - Privacy-Preserving Techniques Handbook
[43:26] - Cryptographic Techniques
[44:12] - Global Big Datasets
[44:55] - Big Data
[47:41] - Automatic Identification System (AIS)
[48:14] - Automatic Dependent Surveillance (ADS-B)
[48:41] - Satellite Imagery
[49:11] - Services in the platform
[49:16] - Location Analytics Service
[50:06] - Stack Sample
[50:37] - Data Sources
[51:50] - NiFi Dataflow
[52:20] - Is this how you enabled reproducibility?
[53:47] - Location Analytics Service
[55:31] - Shanghai - Flights
[55:45] - Shanghai - Cargo Ships
[56:00] - UN Global Platform

Nov 12, 2020 • 36min

When Machine Learning meets Data Privacy - Episode 2 with Cat Coode

What are regulations saying about data privacy? We are already aware of the importance of using Machine Learning to improve businesses. But Machine Learning needs data to feed on, and in many cases that data is sensitive information. So, does this mean that with new privacy regulations, access to data will become more and more difficult? Are the days of ML and Data Science numbered? Or will the Machines beat privacy? To answer all these questions I’ve invited Cat Coode, an expert on Data Privacy regulations, to join me in this episode and help us sort them out! Don’t forget to subscribe to the MLOps.community Slack, and if you’re looking for privacy-preserving solutions, show us some love and give a star to the Synthetic data open-source repo (https://github.com/ydataai/ydata-synthetic)

Useful links: For more on Cat's work, you can have a look at catcoode.com or connect through LinkedIn. Original Privacy by design definition: https://www.ipc.on.ca/wp-content/uploads/resources/7foundationalprinciples.pdf

Nov 10, 2020 • 1h 1min

When You Say Data Scientist Do You Mean Data Engineer? Lessons Learned From Start Up Life // Elizabeth Chabot

In this episode, we talked to Elizabeth Chabot, Consultant at Deloitte, about When You Say Data Scientist Do You Mean Data Engineer? Lessons Learned From StartUp Life.

// Key takeaways:
- If you have a data product that you want to function in production, you need MLOps
- Education needs to happen about the data product life cycle, noting that ML is just part of the equation
- Titles need to be defined to help outside users understand the differences in roles

// Abstract: ML and AI may sound sexy to investors, but if you work in the field you've probably spent late nights reviewing outputs manually, pored over logs and run root cause analyses until your eyes hurt. If you've created data products at a company where analytics and data science held no meaning before your arrival, you've probably spent many a late night explaining the basics of data collection, why ETL cannot be half-baked and that when you create a supervised model it needs to be supervised. Companies hoping to create a data product can have a data scientist show them how ML/AI can further their product, help them scale, or create better recommendations than their competitors. What companies are not always aware of is that once the algorithm is created, the data scientist is usually handicapped until more data hires are made to build the necessary pipelines and frontend to put the algorithm in production. With the number of unique data titles growing each year, how should the first data-evangelist-wrangler-wizard navigate title assignment?

// Bio: Elizabeth is a researcher turned data nerd. With a background in social and clinical sciences, Elizabeth is focused on developing data solutions that create value while allowing the user to make more intelligent decisions.
----------- Connect With Us ✌️------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/

Nov 10, 2020 • 1h

Metaflow: Supercharging Our Data Scientist Productivity // Ravi Kiran Chirravuri // MLOps Meetup #41

MLOps community meetup #41! Last Wednesday's episode was so exciting that some attendees couldn't help asking when the next season of their favorite series is coming! The conversation was around Metaflow: Supercharging Data Scientist Productivity with none other than Netflix’s very own Ravi Kiran Chirravuri.

// Abstract: Netflix's unique culture affords its data scientists an extraordinary amount of freedom. They are expected to build, deploy, and operate large machine learning workflows autonomously without the need to be significantly experienced with systems or data engineering. Metaflow, our ML framework (now open-source at metaflow.org), provides them with delightful abstractions to manage their project's lifecycle end-to-end, leveraging the strengths of the cloud: elastic compute and high-throughput storage. In this talk, we preface with our experience working alongside data scientists, present our human-centric design principles when building Machine Learning Infrastructure, and showcase how you can adopt these yourself with ease with open-source Metaflow.

// Bio: Ravi is an individual contributor to the Machine Learning Infrastructure (MLI) team at Netflix. With almost a decade of industry experience, he has been building large-scale systems focusing on performance, simplified user journeys, and intuitive APIs in MLI and previously Search Indexing and TensorFlow at Google.

----------- Connect With Us ✌️-------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Ravi on LinkedIn: https://www.linkedin.com/in/seeravikiran/

Timestamps:
[00:00] - Introduction to Ravi Kiran Chirravuri
[02:21] - Ravi's background
[05:19] - Metaflow: Supercharging Data Scientist Productivity
[05:31] - Why do we have to build Metaflow?
[06:14] - Infographic of a very simplified view of a machine learning workflow
[07:01] - "An idea is typically meaningless without execution."
[07:38] - Scheduling
[08:14] - Life is great!
[08:24] - Life happens and things are crashing and burning!
[09:04] - What is Metaflow?
[12:01] - How much data scientist cares
[12:25] - How infrastructure is needed
[13:03] - What Metaflow does
[13:44] - How can you go about using Metaflow for your data science needs?
[14:20] - People love DAGs
[16:00] - Baseline
[16:16] - Architecture
[17:28] - Syntax
[19:00] - Vertical Scalability
[21:10] - Horizontal Scalability
[22:59] - Failures are a feature
[23:57] - State Transfer and Persistence
[27:05] - Dependencies
[30:57] - Model Ops: Versioning
[33:19] - Monitoring in Notebooks
[35:16] - Decouple Orchestration
[36:48] - AWS Step Functions
[37:16] - Export to AWS Step Functions
[38:10] - From Prototype to Production and Back
[42:07] - What are the prerequisites to use Metaflow?
[43:32] - Where does Metaflow store everything?
[45:10] - Are there any tutorials available?
[45:22] - Have the tutorials been updated?
[47:27] - How do you deploy Metaflow?
[49:02] - Do you see Metaflow becoming a tool to develop and support AutoML?
[50:34] - What were some of the biggest learnings that you saw people doing that they're not doing at Netflix?
[52:19] - Does Metaflow exist to help data scientists to orchestrate everything?
[54:30] - What do you version?

Nov 9, 2020 • 47min

Luigi in Production // MLOps Coffee Sessions #18 // Luigi Patruno ML in Production

Coffee Sessions #18 with Luigi Patruno of ML in Production, a Centralized Repository of Best Practices

Summary
- Luigi Patruno and ML in Production
- MLOps workflow: knowledge sharing and best practices
- Objective: learn!

Links:
- ML in Production: https://mlinproduction.com/
- Why I started MLinProduction: https://mlinproduction.com/why-i-started-mlinproduction/

Luigi Patruno: a man whose goal is to help data scientists, ML engineers, and AI product managers build and operate machine learning systems in production.

Luigi shares with us why he started ML in Production: there was a lot of irrelevant content and a lot of clickbait with low standards of quality. He had an entrepreneurial itch, and the solution was to start a weekly newsletter. From there he started creating blog posts and has now teamed up with Sam Charrington of TWIML to create courses on SageMaker ML.

Applied ML best practices
- Reading Google and Microsoft papers
- Analyzing the tools that are out there, e.g. SageMaker, and how they see the world
- Aimed at making you more effective and efficient at your job

Community questions
Taking some time to answer some community questions!
- Who do you learn from? Favorite resources?
- Self-taught, papers, talks
- Construct the systems
- Uber Michelangelo

----------------- 📝 Rough notes 📝 ----------------

Any companies that stand out to you in terms of MLOps excellence?
- Google, Amazon, Stitch Fix: they've had to solve hard problems
- Serving ads
- Personalization at scale
- Vertical problems: within their verticals
- Motivated by real challenges
- Dropbox: great articles, a great machine learning company

Tools
- SageMaker
- Has a course on SageMaker
- Nice lessons baked into the system

Dos and don'ts of MLOps
- DO LOG!
- Monitor
- Automate: manual analysis leads to problems
- Do it manually first until you feel confident that you can automate it
- Tag, version
- Store your training, val, and test sets!

What is his process of identifying use cases that are suitable for machine learning as a solution? How do they proceed methodically?
- Start with business goal
- Potential number of users that the solution can benefit
- The ability to build a predictive model
- Performance x impact = score
- Rank problems by this
- How developed are the datasets?

What part of the ML in Production process do people underestimate the most? What are the low hanging fruits that many people don’t take advantage of?
- Generate actual value without needing to build the most complex model possible
- In industry, performance is only one part of the equation

How has he seen ML in production evolve over the last few years and where does he think it's headed next?
- More and more tools!
- Industry-specific tools taking advantage of ML
- Problem is you must have industry knowledge

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
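The use-case ranking from the rough notes above ("Performance x impact = score", then rank problems by it) can be sketched in a few lines of Python. This is only an illustration: the `UseCase` class, the 0-to-1 performance scale, the use of user counts as the impact figure, and the sample numbers are all assumptions made for the example, not details from the episode.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    """A candidate ML use case (hypothetical structure for illustration)."""
    name: str
    performance: float  # assumed 0..1 estimate of how well a model could predict
    impact: float       # assumed number of users the solution could benefit

    @property
    def score(self) -> float:
        # The heuristic from the notes: performance x impact = score.
        return self.performance * self.impact


def rank_use_cases(cases):
    """Return the use cases sorted from highest to lowest score."""
    return sorted(cases, key=lambda c: c.score, reverse=True)


# Made-up candidates, purely for demonstration.
candidates = [
    UseCase("churn prediction", performance=0.8, impact=50_000),
    UseCase("ad ranking", performance=0.6, impact=200_000),
    UseCase("invoice OCR", performance=0.9, impact=5_000),
]

ranked = rank_use_cases(candidates)
for case in ranked:
    print(f"{case.name}: {case.score:.0f}")
```

A use case with a weaker expected model can still rank first when it reaches many more users, which matches the point in the notes that performance is only one part of the equation.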

Nov 5, 2020 • 19min

When Machine Learning meets Data Privacy

This is the first episode of a podcast series on Machine Learning and Data privacy. Machine Learning is the key to the new revolution in many industries. Nevertheless, ML does not exist without data, and a lot of it, which in many cases results in the use of sensitive information. With new privacy regulations, access to data is now much harder, but does that mean the days of ML and Data Science are numbered? Will the Machines beat privacy? Don’t forget to subscribe to the MLOps.community Slack (https://go.mlops.community/slack) and to give a star to the Synthetic data open-source repo (https://github.com/ydataai/ydata-synt...)

Useful links:
- Medium post with the podcast transcription - https://medium.com/@fabiana_clemente/...
- In case you’re curious about GDPR fines - enforcementtracker.com
- The Netflix Prize - https://www.nytimes.com/2010/03/13/technology/13netflix.html
- Tensorflow privacy - https://github.com/tensorflow/privacy

Nov 3, 2020 • 1h 1min

Analyzing the Google Paper on Continuous Delivery in ML // Part 4 // MLOps Coffee Sessions #17

MLOps level 2: CI/CD pipeline automation

For a rapid and reliable update of the pipelines in production, you need a robust automated CI/CD system. This automated CI/CD system lets your data scientists rapidly explore new ideas around feature engineering, model architecture, and hyperparameters. They can implement these ideas and automatically build, test, and deploy the new pipeline components to the target environment.

Figure 4. CI/CD and automated ML pipeline.

This MLOps setup includes the following components:
- Source control
- Test and build services
- Deployment services
- Model registry
- Feature store
- ML metadata store
- ML pipeline orchestrator

Characteristics of stages discussion.

Figure 5. Stages of the CI/CD automated ML pipeline.

The pipeline consists of the following stages:
- Development and experimentation: You iteratively try out new ML algorithms and new modelling where the experiment steps are orchestrated. The output of this stage is the source code of the ML pipeline steps that are then pushed to a source repository.
- Pipeline continuous integration: You build source code and run various tests. The outputs of this stage are pipeline components (packages, executables, and artefacts) to be deployed in a later stage.
- Pipeline continuous delivery: You deploy the artefacts produced by the CI stage to the target environment. The output of this stage is a deployed pipeline with the new implementation of the model.
- Automated triggering: The pipeline is automatically executed in production based on a schedule or in response to a trigger. The output of this stage is a trained model that is pushed to the model registry.
- Model continuous delivery: You serve the trained model as a prediction service for the predictions. The output of this stage is a deployed model prediction service.
- Monitoring: You collect statistics on the model performance based on live data. The output of this stage is a trigger to execute the pipeline or to execute a new experiment cycle.

The data analysis step is still a manual process for data scientists before the pipeline starts a new iteration of the experiment. The model analysis step is also a manual process.

Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
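The monitoring stage described above turns live performance statistics into a trigger for the pipeline. One way this could look is sketched below in plain Python. The function name `should_trigger_pipeline`, the rolling-window rule, and the tolerance value are assumptions chosen for illustration; the paper does not prescribe a specific rule.

```python
from statistics import mean


def should_trigger_pipeline(live_scores, baseline, tolerance=0.05, window=3):
    """Decide whether monitoring should trigger a pipeline run.

    live_scores: model performance readings on live data, oldest first.
    baseline:    performance measured when the model was deployed.
    tolerance:   allowed drop below the baseline (assumed value).
    window:      number of most recent readings to average (assumed value).
    """
    if len(live_scores) < window:
        # Not enough evidence yet; do not trigger on a handful of readings.
        return False
    recent = mean(live_scores[-window:])
    return recent < baseline - tolerance


healthy = [0.91, 0.90, 0.92]
degraded = [0.90, 0.85, 0.84, 0.83]

print(should_trigger_pipeline(healthy, baseline=0.92))   # recent mean stays near baseline
print(should_trigger_pipeline(degraded, baseline=0.92))  # recent mean has dropped well below
```

Averaging over a window rather than reacting to a single reading is a design choice that avoids retraining on one noisy measurement; in a real setup the returned flag would start the automated pipeline or open a new experiment cycle.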

Oct 30, 2020 • 58min

Hands-on serving models using KFserving // Theofilos Papapanagiotou // Data Science Architect at Prosus // MLOps Meetup #40

MLOps community meetup #40! Last Wednesday, we talked to Theofilos Papapanagiotou, Data Science Architect at Prosus, about Hands-on Serving Models Using KFserving.

// Abstract: We looked at some popular model formats like the SavedModel of TensorFlow, the Model Archiver of PyTorch, pickle and ONNX, to understand how the weights of the NN are saved there, the graph, and the signature concepts. We discussed the relevant resources of the deployment stack of Istio (the Ingress gateway, the sidecar and the virtual service) and Knative (the service and revisions), as well as Kubeflow and KFServing. Then we got into the design details of KFServing, its custom resources, the controller and webhooks, the logging, and configuration. We spent a large part of the session on the monitoring stack, the metrics of the servable (memory footprint, latency, number of requests), as well as the model metrics like the graph, init/restore latencies, the optimizations, and the runtime metrics which end up in Prometheus. We looked at the inference payload and prediction logging to observe drifts and trigger the retraining of the pipeline. Finally, a few words about the awesome community and the roadmap of the project on multi-model serving and inference routing graph.

// Bio: Theo is a recovering Unix Engineer with 20 years of work experience in Telcos, on internet services, video delivery, and cybersecurity. He is also a university student for life; BSc in CS 1999, MSc in Data Coms 2008, and MSc in AI 2017. Nowadays he calls himself an ML Engineer, as he expresses through this role his passion for System Engineering and Machine Learning. His analytical thinking is driven by curiosity and hacker spirit. He has skills that span a variety of different areas: Statistics, Programming, Databases, Distributed Systems, and Visualization.
----------- Connect With Us ✌️------------- Join our Slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Theofilos on LinkedIn: https://linkedin.com/in/theofpa
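The abstract above mentions using logged inference payloads to observe drift and trigger retraining. As a rough illustration of what a drift check on one logged feature might look like, here is a sketch using the Population Stability Index (PSI). PSI is a technique swapped in for the example, not something the talk is known to have used, and the bin edges, the epsilon floor, and the sample values are all assumptions. The code is plain Python and does not call any KFServing API.

```python
import math


def bin_fractions(values, edges, eps=1e-6):
    """Fraction of values per bin; empty bins are floored at eps to avoid log(0)."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(edges) - 1):
            is_last = i == len(edges) - 2
            # Bins are half-open [lo, hi), except the last, which includes its upper edge.
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = max(sum(counts), 1)
    return [max(c / total, eps) for c in counts]


def psi(expected, actual, edges):
    """Population Stability Index between a reference sample and a live sample."""
    e = bin_fractions(expected, edges)
    a = bin_fractions(actual, edges)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


edges = [0.0, 0.25, 0.5, 0.75, 1.0]  # assumed bins for a feature scaled to 0..1
training_values = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
logged_values = [0.8, 0.85, 0.9, 0.95, 0.9, 0.7]  # made-up logged payload values

drift = psi(training_values, logged_values, edges)
print(f"PSI: {drift:.2f}")
```

A commonly quoted rule of thumb treats a PSI above roughly 0.2 as a meaningful shift, but the threshold that should actually kick off retraining depends on the model and the feature.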

Oct 27, 2020 • 57min

Operationalize Open Source Models with SAS Open Model Manager // Ivan Nardini // Customer Engineer at SAS // MLOps Meetup #39

MLOps community meetup #39! Last week we talked to Ivan Nardini, Customer Engineer at SAS, about Operationalize Open Source Models with SAS Open Model Manager.

// Abstract: Analytics are Open. By their nature, Open Source technologies allow agile development of models, but putting those models in production is difficult. The goal of SAS is to support customers in operationalizing analytics. In this meetup, I present SAS Open Model Manager, a containerized ModelOps tool that accelerates deployment processes and, once in production, allows monitoring your models (SAS and Open Source).

// Bio: As a member of the Pre-Sales CI & Analytics Support Team, I specialize in ModelOps and Decisioning. I've been involved in operationalizing analytics using different Open Source technologies in a variety of industries. My focus is on providing solutions to deploy, monitor and govern models in production and optimize business decision processes. To reach this goal, I work with software technologies (SAS Viya platform, Container, CI/CD tools) and Cloud (AWS).

// Other links where you can find Ivan: https://medium.com/@ivannardini

----------- Connect With Us ✌️-------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Ivan on LinkedIn: https://www.linkedin.com/in/ivan-nardini

Timestamps:
0:00 - Intro to Ivan Nardini
3:41 - Operationalize Open Source Models with SAS Open Model Manager slide
4:21 - Agenda
5:01 - What is ModelOps and what is the difference between MLOps and ModelOps?
6:19 - "Do I look like an expert?" Ivan's Background
7:12 - Why ModelOps?
7:20 - Operationalizing Analytics
8:12 - Operationalizing Analytics: SAS
9:08 - Operationalizing Analytics: Customer
11:36 - What's a model for you?
12:07 - Hidden Complexity in ML Systems
12:52 - Hidden Complexity in ML Systems: Business Perspective
14:12 - Hidden Complexity in ML Systems: IT Perspective
17:12 - One of the hardest things is Security?
17:52 - Hidden Complexity in ML Systems: Analytics Perspective
19:20 - Why ModelOps?
20:09 - ModelOps technologies Map
22:29 - Customers ModelOps Maturity over Technology Propensity. MLOps Maturity vs. Technology Propensity
26:23 - Show us your Analytical Models
26:56 - SAS can support you to ship them in production providing Governance and Decisioning.
27:28 - When you talk to people, is there something that you feel like there is a unified model, but focusing on the wrong thing?
29:14 - Have you seen Reproducibility and Governance?
30:47 - Advertising Time
30:55 - Operationalize Open Source Models with SAS Open Model Manager
31:02 - ModelOps with SAS
32:06 - SAS Open Model Manager
33:18 - Demo
33:27 - SAS Model Ops Architecture - Classification Model
35:02 - Model Demo: Credit Scoring Business Application
50:20 - Take Homes
50:24 - Operationalize Analytics
50:32 - Model Lifecycle Effort Side
51:20 - Business Value Side
51:47 - Typical Analytics Operationalization Graph
52:18 - Analytics Operationalization with ModelOps Graph
53:18 - Is this for everybody?
