#600: Amazon SageMaker Multi Model Endpoints

From AWS Podcast

Length:

20 minutes

Released:

Jul 3, 2023

Format:

Podcast episode

Description

Amazon SageMaker Multi-Model Endpoint (MME) is a fully managed capability of SageMaker Inference that lets customers deploy thousands of models on a single endpoint and save costs by sharing the instances behind that endpoint across all of the models. Until recently, MME was supported only for machine learning (ML) models that run on CPU instances. Now customers can also use MME to deploy thousands of ML models on GPU-based instances and potentially reduce costs by up to 90%. MME dynamically loads and unloads models from GPU memory based on incoming traffic to the endpoint, and the savings come from thousands of models sharing the same GPU instances. Customers can run models from multiple ML frameworks, including PyTorch, TensorFlow, XGBoost, and ONNX. To get started, customers use the NVIDIA Triton™ Inference Server and deploy models on SageMaker's GPU instances in "multi-model" mode. Once the MME is created, customers specify which model they want inference from each time they invoke the endpoint. MME for GPU is available in all AWS Regions where Amazon SageMaker is available.
To learn more, check out:
Our launch blog: https://go.aws/3NwtJyh
Amazon SageMaker website: https://go.aws/44uCdNr
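
As a rough illustration of the "specify the model while invoking the endpoint" step described above, the sketch below builds a request for the SageMaker runtime `invoke_endpoint` call in boto3, where the `TargetModel` parameter names the model artifact to use. The endpoint name, artifact name, and JSON payload shape are hypothetical placeholders, not details from the episode; the payload a Triton-backed endpoint actually expects depends on how the model is configured.

```python
import json
from typing import Any, Dict


def build_invoke_request(
    endpoint_name: str, target_model: str, payload: Dict[str, Any]
) -> Dict[str, Any]:
    """Assemble keyword arguments for the SageMaker runtime invoke_endpoint call.

    TargetModel selects which model hosted on the multi-model endpoint
    should serve this particular request.
    """
    return {
        "EndpointName": endpoint_name,
        "TargetModel": target_model,
        "ContentType": "application/json",  # assumption: JSON request body
        "Body": json.dumps(payload).encode("utf-8"),
    }


def invoke(endpoint_name: str, target_model: str, payload: Dict[str, Any]) -> bytes:
    """Send the request. Requires boto3, AWS credentials, and a deployed endpoint."""
    import boto3  # imported lazily so the request builder works without AWS access

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        **build_invoke_request(endpoint_name, target_model, payload)
    )
    return response["Body"].read()


# Hypothetical names, for illustration only.
request = build_invoke_request("my-gpu-mme", "model-a.tar.gz", {"inputs": [1, 2, 3]})
print(request["TargetModel"])
```

Two calls to the same endpoint that differ only in `TargetModel` would be served by different models, which is what allows many models to share one set of instances.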

Titles in the series (100)

The AWS Podcast is the definitive cloud platform podcast for developers, DevOps engineers, and cloud professionals seeking the latest news and trends in storage, security, infrastructure, serverless, and more. Join Simon Elisha and Jeff Barr for regular updates, deep dives, and interviews. Whether you're building machine learning and AI models, open source projects, or hybrid cloud solutions, the AWS Podcast has something for you.
