
MLOps.community

Latest episodes

Oct 4, 2020 • 56min

MLOps Meetup #36: Moving Deep Learning from Research to Prod Using DeterminedAI and Kubeflow // David Hershey, DeterminedAI

MLOps community meetup #36! This week we talk to David Hershey, Solutions Engineer at Determined AI, about moving deep learning from research to production with Determined and Kubeflow.

// Key takeaways:
What components are needed to do inference in ML
How to structure models for ML inference
How a model registry helps organize your models for easy consumption
How you can set up reusable and easy-to-upgrade inference pipelines

// Abstract: Translating the research that goes into creating a great deep learning model into a production application is a mess without the right tools. ML models have a lot of moving pieces, and on top of that, models are constantly evolving as new data arrives or the model is tweaked. In this talk, we'll show how you can find order in that chaos by using the Determined Model Registry along with Kubeflow Pipelines.

// Bio: David Hershey is a Solutions Engineer at Determined AI. David has a passion for machine learning infrastructure, in particular systems that enable data scientists to spend more time innovating and changing the world with ML. Previously, David worked at Ford Motor Company as an ML engineer, where he led the development of Ford's ML platform. He received his MS in Computer Science from Stanford University, where he focused on artificial intelligence and machine learning.

// Relevant Links
www.determined.ai
https://github.com/determined-ai/determined
https://determined.ai/blog/production-training-pipelines-with-determined-and-kubeflow/

Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/david-hershey-458ab081/

Timestamps:
0:00 - Intros
4:15 - The structure of the chat
5:20 - What is Determined AI?
7:20 - How is Determined AI different from other, more standard artifact storage solutions?
9:25 - Where are the boundaries between what Determined AI does really well and where it hands off smoothly to other tools around it?
11:48 - Is Kubeflow dying?
13:54 - How do you see Determined AI and Kubeflow becoming more solidified?
15:55 - How does Determined AI interact with Kubeflow at the moment?
18:01 - What types of models are they, and what is the Kubeflow metadata?
19:18 - What is a model registry and why is it so important to have one?
23:16 - Can you give us a quick demo?
30:52 - Which orchestration tool should you use?
32:04 - When using Kubeflow or Determined, how can you deploy the model through CD tools like Jenkins?
33:40 - How is Determined connected to Kubeflow?
36:09 - What components are needed to do inference in machine learning, and how can we structure models for that inference?
40:04 - Are they the same people when we talk about ML researchers?
42:14 - How can we be better prepared for when we do want to get into production?
44:59 - Where in this pipeline do you normally see people getting stopped?
47:05 - What things have you seen pop up that you weren't necessarily thinking about in those first phases?
50:17 - What is the most underrated topic regarding deploying machine learning models in production?
52:44 - How do you see the adoption of tools such as Determined and Kubeflow by data scientists?
54:40 - Can you explain the Determined open source components?
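For a flavor of the registry-to-serving handoff discussed above, here is a rough sketch written against the Determined Python SDK roughly as it looked around the time of this talk. The master address, the model name "object-detector", and the exact method names should all be treated as assumptions rather than a definitive reference; check the Determined docs and the blog post linked above for the current API.

from determined.experimental import Determined

# Assumptions: a Determined master at this address and a model named
# "object-detector" that already has at least one registered checkpoint.
client = Determined(master="http://localhost:8080")

model = client.get_model("object-detector")   # look the model up by name
checkpoint = model.get_version()              # latest registered version by default
path = checkpoint.download()                  # pull the weights for a serving/inference step

print(f"Serving checkpoint {checkpoint.uuid} downloaded to {path}")

A Kubeflow Pipelines serving step could run this lookup at deploy time so the pipeline always picks up whichever checkpoint was most recently registered, rather than hard-coding an artifact path.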
Sep 22, 2020 • 1h 8min

MLOps Coffee Sessions #10 Analyzing the Article "Continuous Delivery and Automation Pipelines in Machine Learning" // Part 2

Second installment of David and Demetrios reviewing the Google paper on continuous training and automated pipelines. They dive deep into machine learning monitoring and what continuous training actually entails.

Some key highlights:

Automatically retraining and serving the models: when should you do it?
Outlier detection: what is it, and how do you deal with it?
Drift detection: individual features may start to drift. This could be a bug, or it could be perfectly normal behavior that indicates the world has changed and the model needs to be retrained. Example changes: shifts in people's preferences, marketing campaigns, competitor moves, the weather, the news cycle, locations, time, devices (clients).

If the world you're working with is changing over time, model deployment should be treated as a continuous process. What this tells me is that you should keep the data scientists and engineers working on the model instead of immediately moving them to another project.

Deeper dive into concept drift: feature/target distributions change. An overview of concept drift applications: "...data analysis applications, data evolve over time and must be analyzed in near real time. Patterns and relations in such data often evolve over time, thus, models built for analyzing such data quickly become obsolete over time. In machine learning and data mining this phenomenon is referred to as concept drift."
https://www.win.tue.nl/~mpechen/publications/pubs/CD_applications15.pdf
https://www-ai.cs.tu-dortmund.de/LEHRE/FACHPROJEKT/SS12/paper/concept-drift/tsymbal2004.pdf

Types of concept drift: sudden and gradual.

Google is, in some way, trying to address this concern: the world is changing and you want your ML system to change as well, so it can avoid decreased performance but also improve over time and adapt to its environment. This sort of robustness is necessary for certain domains. Continuous delivery and automation of pipelines (data, training, prediction service) was built with this in mind: minimizing the commit-to-deploy interval and maximizing the velocity of software delivery and its components - maintainability, extensibility, and testability.

Once the pipeline is ready, you can run it, and you can do this continuously. After the pipeline is deployed to the production environment, it is executed automatically and repeatedly to produce a trained model that is stored in a central model registry. The pipeline should be able to run on a schedule or based on triggers: events that you have configured for your business domain, such as new data arriving or a drop in performance of the production model - a drift check like the one sketched below can serve as such a trigger.

The link between the model artifact and the pipeline is never severed. What pipeline trained it? What data was extracted and validated, and how was it prepared? What was the training configuration, and how was it evaluated? Metrics are key here. Lineage tracking!
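To make the feature-drift trigger above concrete, here is a minimal sketch (ours, not from the article) that flags drift when a production feature's distribution diverges from its training distribution under a two-sample Kolmogorov-Smirnov test. The alpha threshold and the toy data are assumptions you would tune per feature and per domain.

import numpy as np
from scipy.stats import ks_2samp

def feature_drifted(train_values, live_values, alpha=0.01):
    """Flag drift when the live distribution differs significantly from training."""
    statistic, p_value = ks_2samp(train_values, live_values)
    return p_value < alpha

# Toy data: training distribution vs. a shifted live distribution.
rng = np.random.default_rng(0)
train = rng.normal(loc=0.0, scale=1.0, size=5_000)
live = rng.normal(loc=0.6, scale=1.0, size=1_000)

if feature_drifted(train, live):
    print("Drift detected - trigger the continuous training pipeline")

In practice this check would run per feature on a schedule, and a positive result would kick off the retraining pipeline rather than just print a message.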
Keeping a close tie between the dev/experiment pipeline and the continuous production pipeline helps avoid inconsistencies between the model artifacts produced by the pipeline and the models being served - those are hard to debug.

Join our slack community: https://join.slack.com/t/mlops-community/shared_invite/zt-391hcpnl-aSwNf_X5RyYSh40MiRe9Lw
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Chris Sterry on LinkedIn: https://www.linkedin.com/in/chrissterry/
Sep 17, 2020 • 53min

MLOps Meetup #34: Streaming Machine Learning with Apache Kafka and Tiered Storage // Kai Waehner, Confluent

MLOps Meetup #34! This week we talk to Kai Waehner about the beast that is Apache Kafka and the many different ways you can use it!

// Key takeaways:
- Kafka is much more than just messaging
- Kafka is the de facto standard for processing huge volumes of data at scale in real time
- Kafka and machine learning are complementary for various use cases (including data integration, data processing, model training, model scoring, and monitoring)

// Abstract: The combination of Apache Kafka, tiered storage, and machine learning frameworks such as TensorFlow enables you to build one scalable, reliable, but also simple infrastructure for all machine learning tasks using the Apache Kafka ecosystem and Confluent Platform. This discussion features a predictive maintenance use case within a connected car infrastructure, but the components and architecture discussed are helpful in any industry.

// Bio: Kai Waehner is a Technology Evangelist at Confluent. He works with customers across the globe and with internal teams like engineering and marketing. Kai's main areas of expertise lie within the fields of big data analytics, machine learning, hybrid cloud architectures, event stream processing, and the Internet of Things. He is a regular speaker at international conferences such as Devoxx, ApacheCon, and Kafka Summit, writes articles for professional journals, and shares his experiences with new technologies on his blog: www.kai-waehner.de.

Join our slack community: https://join.slack.com/t/mlops-community/shared_invite/zt-391hcpnl-aSwNf_X5RyYSh40MiRe9Lw
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Kai: contact@kai-waehner.de / @KaiWaehner / LinkedIn (https://www.linkedin.com/in/megachucky/)

________Show Notes_______
Blog posts on tiered storage:
https://www.confluent.io/blog/streaming-machine-learning-with-tiered-storage/
https://www.confluent.io/resources/kafka-summit-2020/apache-kafka-tiered-storage-and-tensorflow-for-streaming-machine-learning-without-a-data-lake/
Blog post about using Kafka as a database:
https://www.kai-waehner.de/blog/2020/03/12/can-apache-kafka-replace-database-acid-storage-transactions-sql-nosql-data-lake/
Example repo on GitHub:
https://github.com/kaiwaehner/hivemq-mqtt-tensorflow-kafka-realtime-iot-machine-learning-training-inference
Model serving vs. embedded model in Kafka applications:
https://www.confluent.io/blog/machine-learning-real-time-analytics-models-in-kafka-applications/
https://www.confluent.io/kafka-summit-san-francisco-2019/event-driven-model-serving-stream-processing-vs-rpc-with-kafka-and-tensorflow/
Istio blog post:
https://www.kai-waehner.de/blog/2019/09/24/cloud-native-apache-kafka-kubernetes-envoy-istio-linkerd-service-mesh/
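To make the embedded-model scoring pattern from the "model serving vs. embedded" links concrete, here is a minimal sketch (not taken from the talk) of a Kafka consumer that scores each event with a locally loaded TensorFlow model instead of calling out to a separate model server. The broker address, topic name, JSON schema, and model path are assumptions; Kai's full predictive-maintenance demo lives in the GitHub repo above.

import json

import numpy as np
import tensorflow as tf
from confluent_kafka import Consumer

# Assumed artifacts: a saved Keras model and a topic carrying JSON sensor readings.
model = tf.keras.models.load_model("predictive_maintenance_model")
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "ml-scoring",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["car-sensor-data"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        reading = json.loads(msg.value())                  # one sensor reading per message
        features = np.array([reading["values"]], dtype=np.float32)
        score = model.predict(features, verbose=0)         # model embedded in the app, no RPC hop
        print(f"key={msg.key()} failure_risk={float(score[0][0]):.3f}")
finally:
    consumer.close()

The trade-off the talk covers: embedding keeps scoring latency low and the app self-contained, while an RPC model server makes model updates and A/B testing easier to manage independently of the streaming application.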
Sep 14, 2020 • 56min

MLOps Meetup #33 Owned By Statistics: How Kubeflow & MLOps Can Help Secure Your ML Workloads // David Aronchick - Head of Open Source ML Strategy at Azure

While machine learning is spreading like wildfire, very little attention has been paid to the ways that it can go wrong when moving from development to production. Even when models work perfectly, they can be attacked and/or degrade quickly if the data changes. Having a well-understood MLOps process is necessary for ML security! Using Kubeflow, we demonstrated the common ways machine learning workflows go wrong, and how to mitigate them using MLOps pipelines that provide reproducibility, validation, versioning/tracking, and safe/compliant deployment. We also talked about the direction of MLOps as an industry, and how we can use it to move faster, with less risk, than ever before.

David leads Open Source Machine Learning Strategy at Azure. This means he spends most of his time helping humans convince machines to be smarter. He is only moderately successful at this. Previously, he led product management for Kubernetes on behalf of Google, launched Google Kubernetes Engine, and co-founded the Kubeflow project. He has also worked at Microsoft, Amazon, and Chef, and co-founded three startups. When not spending too much time in service of electrons, he can be found on a mountain (on skis), traveling the world (via restaurants), or participating in kid activities, of which there are a lot more than he remembers from when he was that age.

Join our slack community: https://join.slack.com/t/mlops-community/shared_invite/zt-391hcpnl-aSwNf_X5RyYSh40MiRe9Lw
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aronchick/
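As a rough illustration of the "pipelines for reproducibility and versioning" idea above (not David's actual demo), here is a minimal Kubeflow Pipelines v1-style definition in which every step is a pinned container image and the step ordering is compiled into a versionable artifact. The image names and the data_version parameter are placeholders.

import kfp
from kfp import dsl

@dsl.pipeline(
    name="secure-training-demo",
    description="Validate data, train, and evaluate with every step versioned and ordered.",
)
def training_pipeline(data_version: str = "2020-09-01"):
    validate = dsl.ContainerOp(
        name="validate-data",
        image="example.registry/validate-data:0.1.0",   # placeholder image
        arguments=["--data-version", data_version],
    )
    train = dsl.ContainerOp(
        name="train-model",
        image="example.registry/train-model:0.1.0",     # placeholder image
        arguments=["--data-version", data_version],
    )
    evaluate = dsl.ContainerOp(
        name="evaluate-model",
        image="example.registry/evaluate-model:0.1.0",  # placeholder image
        arguments=["--data-version", data_version],
    )
    train.after(validate)       # enforce ordering so every run is reproducible
    evaluate.after(train)

if __name__ == "__main__":
    kfp.compiler.Compiler().compile(training_pipeline, "training_pipeline.yaml")

Because the compiled YAML, the image tags, and the pipeline parameters can all be checked into version control, any production model can be traced back to the exact steps and data version that produced it.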
Sep 14, 2020 • 59min

MLOps Coffee Sessions #9 Analyzing the Article "Continuous Delivery and Automation Pipelines in Machine Learning" // Part 1

In this episode, we covered how Google is thinking about MLOps and how automation plays a key part in their view of MLOps. We started to talk about CI, CD, and the role they play in a pipeline set up for continuous training (CT). In the next episode, we'll pick up where we left off, starting our discussion of CT and some of the reasons you'd want to set up a pipeline with continuous training in the first place.

Join our slack community: https://join.slack.com/t/mlops-community/shared_invite/zt-391hcpnl-aSwNf_X5RyYSh40MiRe9Lw
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Chris Sterry on LinkedIn: https://www.linkedin.com/in/chrissterry/
Sep 8, 2020 • 53min

MLOps Meetup #32 Building Say Less: An AI-Powered Summarization App // Yoav Zimmerman - Founder of Model Zoo

Yoav is the builder behind Say Less, an AI-powered email summarization tool that was recently featured on the front page of Hacker News and Product Hunt. In this talk, Yoav walks us through the end-to-end process of building the tool, from the prototype phase to deploying the model as a real-time HTTP endpoint.

Yoav Zimmerman is the engineer and founder behind Model Zoo, a machine learning deployment platform focused on ease of use. He previously worked at Determined AI on large-scale deep learning training infrastructure and at Google on knowledge base construction for features that powered Google Assistant and Search.

Join our slack community: https://join.slack.com/t/mlops-community/shared_invite/zt-391hcpnl-aSwNf_X5RyYSh40MiRe9Lw
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Chris Sterry on LinkedIn: https://www.linkedin.com/in/chrissterry/
Connect with Yoav on LinkedIn: https://www.linkedin.com/in/yoav-zimmerman-05653252/
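For a feel of the final "real-time HTTP endpoint" step, here is a minimal sketch, not Say Less's actual stack (Yoav deploys through Model Zoo), that wraps an off-the-shelf Hugging Face summarization model in a small Flask service. The model choice, route, and JSON field names are assumptions.

from flask import Flask, jsonify, request
from transformers import pipeline

app = Flask(__name__)
# Off-the-shelf summarizer as a stand-in; Say Less's real model is not public here.
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

@app.route("/summarize", methods=["POST"])
def summarize():
    email_body = request.get_json()["email_body"]
    result = summarizer(email_body, max_length=60, min_length=10, do_sample=False)
    return jsonify({"summary": result[0]["summary_text"]})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)

A client would POST a JSON body like {"email_body": "..."} to /summarize and receive a short summary back, which is essentially the contract a real-time email summarizer needs to expose.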
Sep 8, 2020 • 58min

MLOps Coffee Sessions #8 // MLOps from the Perspective of an SRE // Neeran Gul

|| Links Referenced in the Show ||
General info: https://medium.com/@paktek123
Load Balancer Series: https://medium.com/load-balancer-series
Upcoming Open Source: https://medium.com/upcoming-open-source
Some libraries Neeran maintains: https://github.com/paktek123/elasticsearch-crystal
Some libraries Neeran used to maintain: https://github.com/microsoft/pgtester (and https://medium.com/yammer-engineering/testing-postgresql-scripts-with-rspec-and-pg-tester-c3c6c1679aec)
Some interesting projects Neeran has worked on (he architected these): https://devblogs.microsoft.com/cse/2016/05/22/access-azure-blob-storage-from-your-apps-using-s3-api/, https://medium.com/yammer-engineering/logs-on-logs-on-logs-aggregation-at-yammer-2b7073f35606
Some of the Benevolent work: https://www.benevolent.com/engineering-blog/deploying-metallb-in-production, https://www.benevolent.com/engineering-blog/spark-on-kubernetes-for-nlp-at-scale (he helped with the infra side)

Join our slack community: https://join.slack.com/t/mlops-community/shared_invite/zt-391hcpnl-aSwNf_X5RyYSh40MiRe9Lw
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Chris Sterry on LinkedIn: https://www.linkedin.com/in/chrissterry/
Sep 5, 2020 • 56min

MLOps Meetup #31 // Creating Beautiful Ambient Music with Google Brain’s Music Transformer // Daniel Jeffries - Chief Technology Evangelist at Pachyderm

We trained a Transformer neural net on ambient music to see if a machine can compose with the great masters. Ambient is a soft, flowing, ethereal genre of music that Dan has loved for decades. There are all kinds of ambient, from white noise to tracks that mimic the murmur of soft summer rain in a sprawling forest, but Dan favors ambient that weaves together environmental sounds and dreamy, wavelike melodies into a single, lush tapestry.

Can machine learning ever hope to craft something so seemingly simple yet intricate? The answer is yes, and it's getting better and better with each passing year. It won't be long before artists are co-composing with AI, using software that helps them weave their own masterpieces of sound. In this talk, we looked at how we did it. Along the way we listen to some samples that worked really well and some that didn't work as well as we hoped. You can download the model to play around with yourself. Dan also shows you an end-to-end machine learning pipeline, with downloadable containers that you can string together with ease to train a masterful music-making machine learning model on your own.

Dan Jeffries is Chief Technology Evangelist at Pachyderm. He's also an author, engineer, futurist, and pro blogger, and he's given talks all over the world on AI and cryptographic platforms. He's spent more than two decades in IT as a consultant and at open source pioneer Red Hat. With more than 50K followers on Medium, his articles have held the number one writer's spot on Medium for Artificial Intelligence, Bitcoin, Cryptocurrency, and Economics more than 25 times. His breakout AI tutorial series "Learning AI If You Suck at Math," along with his explosive pieces on cryptocurrency, "Why Everyone Missed the Most Important Invention of the Last 500 Years" and "Why Everyone Missed the Most Mind-Blowing Feature of Cryptocurrency," are shared hundreds of times daily all over social media and have been read by more than 5 million people worldwide.

Join our slack community: https://join.slack.com/t/mlops-community/shared_invite/zt-391hcpnl-aSwNf_X5RyYSh40MiRe9Lw
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Chris Sterry on LinkedIn: https://www.linkedin.com/in/chrissterry/
Connect with Dan on LinkedIn: https://www.linkedin.com/in/danjeffries/
Aug 31, 2020 • 56min

MLOps Coffee Sessions #7 // MLOps and DevOps - Parallels and Deviations // Featuring Damian Brady

MLOps and DevOps have a large number of parallels. Many of the techniques, practices, and processes used for traditional software projects can be followed almost exactly in ML projects. However, the day-to-day of an ML project is usually significantly different from a traditional software project. So while the ideas and principles can still apply, it's important to be aware of the core aims of DevOps when applying them.

Damian is a Cloud Advocate specializing in DevOps and MLOps. After spending a year in Toronto, Canada, he returned to Australia - the land of the dangerous creatures and beautiful beaches - in 2018. Formerly a dev at Octopus Deploy and a Microsoft MVP, he has a background in software development and consulting in a broad range of industries. In Australia, he co-organised the Brisbane .NET User Group, and launched the now annual DDD Brisbane conference. He regularly speaks at conferences, User Groups, and other events around the world. Most of the time you'll find him talking to software engineers, IT pros and managers to help them get the most out of their DevOps strategies.

|| Links Referenced in the Show ||
MLOps, or DevOps for Machine Learning: https://damianbrady.com.au/2019/10/28/mlops-or-devops-for-machine-learning/
Microsoft Azure Machine Learning: http://ml.azure.com/
MLOps Coffee Sessions #6 Continuous Integration for ML // Featuring Elle O'Brien: https://www.youtube.com/watch?v=L98VxJDHXMM
MLOps: Isn't that just DevOps? Ryan Dawson speaks at MLOps Coffee Session: https://www.seldon.io/mlops-isnt-that-just-devops-ryan-dawson-speaks-at-mlops-coffee-session/
DVC - Data Version Control: https://dvc.org/
Pachyderm - Version-controlled data science: https://www.pachyderm.com/
Databricks - Unified Data Analytics: https://databricks.com/

Join our slack community: https://join.slack.com/t/mlops-community/shared_invite/zt-391hcpnl-aSwNf_X5RyYSh40MiRe9Lw
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Damian on LinkedIn: https://www.linkedin.com/in/damianbrady/
Aug 20, 2020 • 57min

MLOps Meetup #30 // Path to Production and Monetizing Machine Learning // Vin Vashishta - Data Scientist | Strategist | Speaker & Author

The concept of machine learning products is a new one for the business world. There is a lack of clarity around key elements: Product Roadmaps and Planning, the Machine Learning Lifecycle, Project and Product Management, Release Management, and Maintenance. In this talk, we covered a framework specific to machine learning products. We discussed the improvements businesses can expect to see from a repeatable process. We also covered the concept of monetization and integrating machine learning into the business model.

Vin is an applied data scientist and teaches companies to monetize machine learning. He is currently working on an ML-based decision support product as well as his strategy consulting practice.

Join our slack community: https://join.slack.com/t/mlops-community/shared_invite/zt-391hcpnl-aSwNf_X5RyYSh40MiRe9Lw
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Vin on LinkedIn: https://www.linkedin.com/in/vineetvashishta/

Timestamps:
[00:00] Intro to Vin Vashishta
[01:33] Vin's background
[05:04] Key problems when monetizing machine learning
[07:00] How can we fix the key problems in monetizing machine learning?
[13:24] How can we go about creating that repeatable process?
[16:17] All these data scientists aren't going to school and getting diplomas just for data wrangling, right?
[17:12] How can you successfully envision that road mapping from the beginning of the process?
[24:19] How can a data scientist be more proactive instead of just getting paid?
[28:53] Have you figured out how to quickly estimate an order of magnitude when ROI questions arise?
[31:48] Have you seen a company that has machine learning as its core product, or have you seen some companies crash and burn?
[34:39] How do you see the tooling ecosystem right now, and how do you see it in a few years?
[38:24] How do you balance that when a lot of these tools bleed and overlap with each other? What does that look like?
[42:40] Have you stumbled across organizations wanting to adopt AI without having the foundations, such as data?
[45:28] How can we convince human curators to do machine learning?
[47:23] What are the three biggest challenges you've faced when monetizing the value of ML products? How did you overcome them?
[50:25] How do you deal with people measuring costs and values?
