Exploring the world of model serving in machine learning, discussing serverless concepts, API endpoints, streaming and batch data, with a sprinkle of coffee vs tea banter. They touch on real-time prediction scenarios, optimizing model serving using Kubeflow, and challenges of deploying models in production. Delve into the practical applications of Kubeflow, model training with the Iris dataset, building custom model services, and planning in-depth MLOps discussions with audience engagement.
Quick takeaways
Serving models with Kubeflow involves using live endpoints for real-time predictions, ensuring accessibility to end users.
Model serving through Kubeflow offers flexibility by abstracting serving frameworks and optimizing job creation for deployment.
Deep dives
Introduction to Coffee Break Sessions
Coffee Break Sessions is a new series of deep dives into various topics, with David teaching Demetrios about different subjects. The goal is to engage with topics suggested by the audience or to have guests share their expertise.
Serving Models and Kubeflow
Serving models involves making trained machine learning models available for real-time predictions to end users through server endpoints. Kubeflow is used to serve models via live endpoints, ensuring predictions are accessible to users when needed.
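To make the "live endpoint" idea concrete, here is a minimal client-side sketch of a real-time prediction request. The payload shape (`{"instances": [...]}`) follows the TensorFlow Serving / KFServing V1 prediction convention; the host and model name are hypothetical placeholders.

```python
import json
import urllib.request


def build_predict_request(host: str, model: str, rows: list) -> urllib.request.Request:
    """Build a KFServing-style V1 prediction request.

    The {"instances": [...]} body matches the TF Serving / KFServing V1
    prediction protocol; the host and model name here are made up.
    """
    url = f"http://{host}/v1/models/{model}:predict"
    body = json.dumps({"instances": rows}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )


# Example: one Iris-style feature row (sepal/petal measurements).
req = build_predict_request("models.example.com", "iris", [[5.1, 3.5, 1.4, 0.2]])
print(req.full_url)  # http://models.example.com/v1/models/iris:predict
```

Sending the request with `urllib.request.urlopen(req)` would return the model's predictions as JSON, assuming such an endpoint were actually deployed.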
Kubeflow Serving Frameworks Comparison
Kubeflow acts as an abstraction layer over serving frameworks such as TensorFlow Serving, XGBoost, and ONNX, handling job creation for each framework. This flexibility allows different types of models to be served through one interface, supporting efficient deployment and scalability.
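That abstraction shows up in KFServing's `InferenceService` resource: you name a framework (`tensorflow`, `xgboost`, `onnx`, ...) and point at a model artifact, and KFServing launches the matching serving runtime. A sketch of such a manifest built as a plain Python dict, with placeholder names and storage URI:

```python
import json


def inference_service(name: str, framework: str, storage_uri: str) -> dict:
    """Sketch of a KFServing (v1alpha2-era) InferenceService manifest.

    The predictor key selects the serving framework; KFServing picks the
    matching runtime, so the same resource shape serves models trained
    with different frameworks.
    """
    return {
        "apiVersion": "serving.kubeflow.org/v1alpha2",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {
            "default": {
                "predictor": {framework: {"storageUri": storage_uri}}
            }
        },
    }


# Swapping "tensorflow" for "xgboost" is the only change needed per framework.
svc = inference_service("iris-model", "tensorflow", "gs://my-bucket/iris/")
print(json.dumps(svc, indent=2))
```

The bucket path and resource name are illustrative only; in practice this dict would be written as YAML and applied to the cluster with `kubectl`.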
Essentials for Serving Models
For serving models, it is vital to understand the Kubernetes API for interacting with a cluster, Docker containers for packaging code, and how to build an API interface. Preparing a model to serve real-time predictions requires fundamental knowledge of containers, APIs, and pre/post-processing methods.
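The pre/post-processing steps mentioned above can be sketched as a tiny request pipeline: raw JSON comes in, is parsed into feature rows, passed to the model, and the raw prediction is mapped back to something user-facing. The `predict` function here is a hypothetical stub standing in for a trained Iris classifier.

```python
import json

IRIS_LABELS = ["setosa", "versicolor", "virginica"]


def preprocess(raw_body: bytes) -> list:
    """Parse the request body into feature rows for the model."""
    payload = json.loads(raw_body)
    return payload["instances"]


def predict(rows: list) -> list:
    """Hypothetical stand-in for a trained Iris classifier.

    A real service would load a serialized model (pickle, SavedModel, ...)
    once at startup and call its predict method here.
    """
    return [0 for _ in rows]  # stub: always predicts class 0


def postprocess(class_ids: list) -> bytes:
    """Map raw class ids back to human-readable labels."""
    labels = [IRIS_LABELS[i] for i in class_ids]
    return json.dumps({"predictions": labels}).encode("utf-8")


# One request through the whole pipeline:
body = json.dumps({"instances": [[5.1, 3.5, 1.4, 0.2]]}).encode("utf-8")
response = postprocess(predict(preprocess(body)))
print(response)  # b'{"predictions": ["setosa"]}'
```

In a real deployment these three functions would sit behind an HTTP handler inside a Docker container, which is exactly the packaging-plus-API combination described above.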
MLOps Coffee Sessions coming at you with our primer episode talking about KFServing! David Aponte and Demetrios Brinkmann dive deep into what model serving is in machine learning, the different types of serving there are, what serverless means, API endpoints, streaming and batch data, and a bit of coffee vs tea banter.
||Show Notes||
ML in Production is Hard Blog article by Nikki: http://veekaybee.github.io/2020/06/09/ml-in-prod/?utm_campaign=Data_Elixir&utm_source=Data_Elixir_289
Interactive learning platform Katacoda: https://www.katacoda.com/
Github repo used in video: https://github.com/aponte411/demos
Blog on different ways to handle model serving: http://bugra.github.io/posts/2020/5/25/how-to-serve-model/
Join our slack community: https://join.slack.com/t/mlops-community/shared_invite/zt-391hcpnl-aSwNf_X5RyYSh40MiRe9Lw
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://zoom.us/webinar/register/WN_a_nuYR1xT86TGIB2wp9B1g
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/