AWS re:Invent Special: SageMaker with Ankur Mehrotra
Jan 9, 2024
Ankur Mehrotra, Director and GM of Amazon SageMaker, discusses the evolution of Amazon SageMaker and its wide use across industry verticals. He and the host also explore new features such as foundation model evaluations and smart load-aware routing, along with the benefits and future development of SageMaker.
SageMaker HyperPod improves training efficiency with checkpointing, self-healing clusters, and reduced resource wastage.
Foundation Model Evaluations in Amazon SageMaker assist users in selecting models based on accuracy, toxicity, and bias dimensions.
Deep dives
Introducing Amazon SageMaker and its Journey
Amazon SageMaker is an AWS service for end-to-end machine learning development. It grew out of Amazon's two decades of experience with machine learning and the challenges its teams faced along the way. SageMaker was built to reduce the burden on developers by providing tools purpose-built for each step of the machine learning lifecycle. Because these tools are integrated with one another, customers no longer have to connect separate bespoke tools themselves. Tens of thousands of AWS customers currently use Amazon SageMaker for their machine learning needs.
SageMaker HyperPod: Efficient Training of Large Language Models
SageMaker HyperPod is a new capability that addresses some of the challenges faced in training large language models. It makes it easier to set up clusters of accelerators and efficiently distribute data and models across these clusters. HyperPod also allows for checkpointing to save progress during the training process and automatically monitors the health of the cluster. This feature provides a self-healing training cluster, reducing time and resource wastage. SageMaker HyperPod offers a zero-touch training experience and has shown a significant reduction in cost for training foundation models.
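The checkpoint-and-resume idea described above can be illustrated with a minimal Python sketch. This is not HyperPod's API: the function names (save_checkpoint, load_checkpoint, train, run_with_restart), the JSON checkpoint file, and the simulated failure are all assumptions made for illustration. It shows only the general pattern of saving progress periodically so that a restart after a failure repeats a few steps rather than the whole run.

```python
import json
import os


def save_checkpoint(path, state):
    # Write to a temporary file first, then rename it into place, so a crash
    # in the middle of a write cannot leave a corrupted checkpoint behind.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    # Resume from the last saved state, or start fresh if none exists.
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}


def train(path, num_steps, checkpoint_every, fail_at=None):
    state = load_checkpoint(path)
    for step in range(state["step"], num_steps):
        if fail_at is not None and step == fail_at:
            # Hypothetical stand-in for a hardware fault on one node.
            raise RuntimeError("simulated node failure")
        state["total"] += step  # stand-in for a real optimizer update
        state["step"] = step + 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state


def run_with_restart(path, num_steps, checkpoint_every, fail_at):
    # A "self-healing" wrapper in miniature: on failure, restart the job,
    # which picks up from the most recent checkpoint.
    try:
        return train(path, num_steps, checkpoint_every, fail_at)
    except RuntimeError:
        return train(path, num_steps, checkpoint_every)
```

With a checkpoint every 3 steps and a failure at step 7, the restarted run resumes from step 6 and redoes only one step of work, which is the kind of saving the summary attributes to checkpointing.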
Foundation Model Evaluations for Model Selection
To assist customers in choosing the right model for their use case, Amazon SageMaker has launched Foundation Model Evaluations. This feature allows users to evaluate models based on various dimensions such as accuracy, toxicity, and bias. It generates a comprehensive model evaluation report by running evaluations with built-in or custom datasets. SageMaker also supports human evaluations for aspects that are hard to assess automatically. This capability helps users make informed decisions when selecting models for their applications.
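As a rough illustration of comparing models on one dimension, the sketch below scores candidate models on exact-match accuracy over a small dataset and ranks them. It is a toy under stated assumptions, not the SageMaker evaluation feature: the function names and the representation of a model as a plain Python callable are hypothetical, and toxicity and bias are not covered because they typically require dedicated classifiers or human review.

```python
def exact_match_accuracy(model, dataset):
    # dataset is a list of (prompt, expected_answer) pairs; a model is any
    # callable that maps a prompt string to an answer string.
    correct = sum(
        1
        for prompt, expected in dataset
        if model(prompt).strip().lower() == expected.strip().lower()
    )
    return correct / len(dataset)


def evaluate_models(models, dataset):
    # Build a simple report: one accuracy score per model, best first.
    scores = {name: exact_match_accuracy(fn, dataset) for name, fn in models.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A real evaluation would use far larger datasets and several metrics per dimension; the point here is only the shape of the workflow: run each candidate over the same data, score it, and compare the results side by side.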
Smart Load-Aware Routing with Efficient Model Deployment
Amazon SageMaker now offers smart load-aware routing for efficient model deployment. This feature optimizes resource utilization and reduces latency by dynamically allocating accelerators to multiple model types on the same instance. It automatically monitors the load on instances and routes inference requests to idle or faster instances, which makes interactive applications more responsive. With this capability, SageMaker reports average cost savings of 50% and a 20% reduction in inference latency.
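One common way to route by load is to send each request to the instance with the fewest requests currently in flight. The Python sketch below shows that idea in isolation. The summary does not say how SageMaker implements its routing, so the LeastLoadedRouter class and its methods are hypothetical names used for illustration only.

```python
class LeastLoadedRouter:
    """Toy router that tracks in-flight requests per instance."""

    def __init__(self, instance_ids):
        # Every instance starts idle.
        self.in_flight = {instance_id: 0 for instance_id in instance_ids}

    def route(self):
        # Pick the instance with the fewest in-flight requests. Ties go to
        # the instance that was registered first.
        target = min(self.in_flight, key=self.in_flight.get)
        self.in_flight[target] += 1
        return target

    def complete(self, instance_id):
        # Call this when an instance finishes a request, freeing capacity.
        if self.in_flight[instance_id] == 0:
            raise ValueError(f"no in-flight requests on {instance_id}")
        self.in_flight[instance_id] -= 1
```

Compared with random or round-robin assignment, this approach avoids queuing a new request behind a long-running one while another instance sits idle, which is the effect on responsiveness the summary describes.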
This episode of Software Engineering Daily is part of our on-site coverage of AWS re:Invent 2023, which took place from November 27th through December 1st in Las Vegas.
In today’s interview, host Jordi Mon Companys speaks with Ankur Mehrotra, the Director and GM of Amazon SageMaker.
Jordi Mon Companys is a product manager and marketer who specializes in software delivery, developer experience, cloud native and open source. He has developed his career at companies like GitLab, Weaveworks, Harness and other platform and devtool providers. His interests range from software supply chain security to open source innovation. You can reach out to him on Twitter at @jordimonpmm.