Episode 046 - Computer Vision on AWS with Francesco Pochetti – Part 2
Jul 29, 2022
Francesco Pochetti, Senior Machine Learning Engineer at Bolt, dives deep into ML tools on AWS, such as NVIDIA Triton and TensorRT, and how they improve processing time for computer vision. He also covers Amazon SageMaker, deploying ML models using Docker, and the importance of understanding the business problem when developing a machine learning model.
Blurring faces with an image segmentation model looks more natural than blurring rectangular detections, and serving that model on Amazon SageMaker with NVIDIA Triton and TensorRT drastically reduces latency.
The four main steps of a machine learning pipeline are understanding the business problem, exploratory data analysis, modeling, and production. Tools such as Amazon SageMaker, SageMaker Clarify, and SageMaker Experiments support these steps.
Deep dives
Creating an Instagram-like filter for blurring faces in photos
The episode walks through building an Instagram-like filter that automatically blurs faces in photos of crowds. The idea came from a tweet by the CEO of Hugging Face, and Francesco first tried off-the-shelf solutions for face detection and image processing. These blurred the rectangular box around each detected face, which looked unnatural. To get a more natural effect, he switched to an image segmentation model, which classifies each pixel as either face or background, so only the face pixels are blurred. Training the model was easy with Amazon SageMaker, and deployment became simpler once SageMaker integrated NVIDIA Triton, used here together with TensorRT. The combined solution drastically reduced latency, making inference 10 times faster. The episode also highlights how far deployment tooling has come and the role of Docker containers.
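The difference between a rectangular blur and a mask-based blur can be sketched in a few lines of NumPy. This is only an illustration of the compositing idea, not the code from the episode or the blog post: the function names `box_blur` and `blur_with_mask`, the simple box filter, and the synthetic circular mask are all assumptions, and in the real pipeline the mask would come from the segmentation model.

```python
import numpy as np


def box_blur(image: np.ndarray, radius: int) -> np.ndarray:
    """Blur an (H, W, C) image by averaging over a (2r+1) x (2r+1) window."""
    size = 2 * radius + 1
    # Repeat edge pixels so the output keeps the input's height and width.
    padded = np.pad(
        image.astype(np.float64),
        ((radius, radius), (radius, radius), (0, 0)),
        mode="edge",
    )
    height, width = image.shape[:2]
    total = np.zeros(image.shape, dtype=np.float64)
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + height, dx:dx + width]
    return total / (size * size)


def blur_with_mask(image: np.ndarray, mask: np.ndarray, radius: int = 3) -> np.ndarray:
    """Blur only the pixels where mask == 1 and keep every other pixel as is.

    `mask` is an (H, W) array of 0s and 1s, standing in for the per-pixel
    face/background prediction of a segmentation model (assumed here).
    """
    image = image.astype(np.float64)
    blurred = box_blur(image, radius)
    alpha = mask.astype(np.float64)[..., None]  # broadcast over channels
    return alpha * blurred + (1.0 - alpha) * image


# Synthetic example: a random "photo" and a circular "face" mask.
rng = np.random.default_rng(0)
photo = rng.random((32, 32, 3))
yy, xx = np.mgrid[:32, :32]
face_mask = ((yy - 16) ** 2 + (xx - 16) ** 2 <= 8 ** 2).astype(np.uint8)
result = blur_with_mask(photo, face_mask)
```

Because the blend is driven by the mask rather than by a bounding box, the blurred region follows the outline of the face, which is what removes the rectangular look.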
The four key steps in a machine learning pipeline
The episode outlines the four main steps of a machine learning pipeline: understanding the business problem, exploratory data analysis (EDA), modeling, and production. Understanding the business problem comes first because it determines the metrics, the baseline, and the desired outcomes. EDA involves collecting and analyzing data to validate the assumptions behind the problem and to check for unexpected patterns. The modeling step covers training and experimenting with different algorithms and models to find the best solution, with tools such as Amazon SageMaker, SageMaker Clarify, and SageMaker Experiments supporting this phase. Lastly, production involves deploying the model, monitoring its performance, managing data drift, and maintaining the model's accuracy over time.
Challenges in machine learning production and the importance of the community
The episode discusses the challenges of running machine learning in production, such as versioning both code and data, monitoring inputs and outputs for data drift, and maintaining accuracy. Francesco also emphasizes how exciting the collaborative and supportive machine learning community is. Its willingness to share knowledge, contribute to open-source projects, and engage in discussions is what drives progress in the field. He encourages others to join open-source projects to sharpen their skills and become better developers and individuals. Francesco can be found on Twitter, LinkedIn, and his personal blog, where he shares content related to machine learning and AWS.
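As one hedged illustration of monitoring inputs for data drift, the sketch below computes the Population Stability Index (PSI) between a reference sample (for example, a feature at training time) and a current sample (the same feature in production). The episode does not name a specific drift metric, so PSI, the function name `population_stability_index`, the ten quantile bins, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def population_stability_index(reference, current, bins: int = 10) -> float:
    """PSI between two 1-D samples, using quantile bins of the reference."""
    reference = np.asarray(reference, dtype=np.float64)
    current = np.asarray(current, dtype=np.float64)

    # Interior bin edges taken from the reference quantiles; values below the
    # first edge or above the last one fall into the two outer bins.
    inner_edges = np.quantile(reference, np.linspace(0.0, 1.0, bins + 1))[1:-1]
    ref_bins = np.searchsorted(inner_edges, reference, side="right")
    cur_bins = np.searchsorted(inner_edges, current, side="right")

    ref_frac = np.bincount(ref_bins, minlength=bins) / reference.size
    cur_frac = np.bincount(cur_bins, minlength=bins) / current.size

    # Avoid log(0) and division by zero for empty bins.
    eps = 1e-6
    ref_frac = np.clip(ref_frac, eps, None)
    cur_frac = np.clip(cur_frac, eps, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))


# Synthetic example: one production sample matches training, one is shifted.
rng = np.random.default_rng(42)
train_feature = rng.normal(0.0, 1.0, size=5000)
prod_same = rng.normal(0.0, 1.0, size=5000)
prod_shifted = rng.normal(1.0, 1.0, size=5000)

psi_same = population_stability_index(train_feature, prod_same)
psi_shifted = population_stability_index(train_feature, prod_shifted)
```

A common rule of thumb treats a PSI below about 0.1 as stable and above about 0.25 as a shift worth investigating, but suitable thresholds depend on the feature and the use case.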
In part two, Dave chats again with Francesco Pochetti, Senior Machine Learning
Engineer at Bolt and an AWS Machine Learning Hero. In this episode, Francesco dives deep into
the ML tools on AWS, starting with NVIDIA Triton and TensorRT and how they
improve processing time for computer vision. He also covers Amazon SageMaker and many other
AWS ML services, as well as the best way to deploy ML models using Docker. If you
missed it, you can listen to part one of this conversation in Episode 045.
Francesco on Twitter: https://twitter.com/Fra_Pochetti
Dave on Twitter: https://twitter.com/thedavedev
Francesco’s Website: https://francescopochetti.com/
Francesco’s LinkedIn: https://www.linkedin.com/in/francescopochetti/
Francesco’s GitHub: https://github.com/FraPochetti
[BLOG] Blurry faces: Training, Optimizing and Deploying a segmentation model on Amazon
SageMaker with NVIDIA TensorRT and NVIDIA Triton -
https://francescopochetti.com/blurry-faces-a-journey-from-training-a-segmentation-model-to-deploying-tensorrt-to-nvidia-triton-on-amazon-sagemaker/
[BLOG] Machine Learning and Developing inside a Docker Container in Visual Studio Code
https://francescopochetti.com/developing-inside-a-docker-container-in-visual-studio-code/
[BLOG] Deploying a Fashion-MNIST web app with Flask and Docker:
https://francescopochetti.com/deploying-a-fashion-mnist-web-app-with-flask-and-docker/
[BLOG] IceVision meets AWS: detect LaTeX symbols in handwritten math and deploy with Docker
on Lambda:
https://francescopochetti.com/icevision-meets-aws-detect-latex-symbols-in-handwritten-math-and-deploy-with-docker-on-lambda/
[DOCS] Amazon Rekognition - https://aws.amazon.com/rekognition/
[DOCS] Amazon SageMaker - https://aws.amazon.com/sagemaker/
[DOCS] Amazon Textract - https://aws.amazon.com/textract/
[DOCS] Deploy fast and scalable AI with NVIDIA Triton Inference Server in Amazon SageMaker
https://aws.amazon.com/blogs/machine-learning/deploy-fast-and-scalable-ai-with-nvidia-triton-inference-server-in-amazon-sagemaker/
[GIT] Nvidia Triton Inference Server:
https://github.com/triton-inference-server/server/
[GIT] Blurry faces: Training, Optimizing and Deploying a segmentation model on Amazon
SageMaker with NVIDIA TensorRT and NVIDIA Triton -
https://github.com/FraPochetti/KagglePlaygrounds/tree/master/triton_nvidia_blurry_faces
Subscribe:
Amazon Music:
https://music.amazon.com/podcasts/f8bf7630-2521-4b40-be90-c46a9222c159/aws-developers-podcast
Apple Podcasts: https://podcasts.apple.com/us/podcast/aws-developers-podcast/id1574162669
Google Podcasts:
https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5zb3VuZGNsb3VkLmNvbS91c2Vycy9zb3VuZGNsb3VkOnVzZXJzOjk5NDM2MzU0OS9zb3VuZHMucnNz
Spotify:
https://open.spotify.com/show/7rQjgnBvuyr18K03tnEHBI
TuneIn:
https://tunein.com/podcasts/Technology-Podcasts/AWS-Developers-Podcast-p1461814/
RSS Feed:
https://feeds.soundcloud