Deploying Models to AWS SageMaker
Amazon launched AWS SageMaker, a machine learning service, in November 2017. It ships with hundreds of built-in algorithms and pre-trained models from model hubs, including TensorFlow Hub, PyTorch Hub, Hugging Face, and MXNet GluonCV, so data scientists can carry out common machine learning tasks with ease.
Generative AI has taken hold globally and is now the new focus of every business. Models need a mature ecosystem: data for model training, a model runtime (AWS infrastructure), model deployment (MLOps), and security at all levels.
MLOps plays an important role across the entire model lifecycle, uniting data collection, preprocessing, modeling, evaluation, production deployment, and re-training into a unified process. MLOps evolved from DevOps principles and is applied specifically to the lifecycle of ML models.
MLflow, an API-based framework, integrates MLOps principles into a model's lifecycle with minimal changes to existing code. With just a few lines of code introduced in a few places, it can track every stage relevant to the model. MLflow also keeps your experiments highly organized even if you never need its cloud deployment capabilities.
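To make this concrete, here is a minimal sketch of that instrumentation: a few MLflow calls added to an ordinary scikit-learn training script. The experiment name and hyperparameter values are placeholders for illustration.

    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    # Placeholder experiment name; runs are recorded locally under ./mlruns by default.
    mlflow.set_experiment("sagemaker-demo")

    X, y = load_iris(return_X_y=True)

    with mlflow.start_run():
        params = {"C": 1.0, "max_iter": 200}
        model = LogisticRegression(**params).fit(X, y)

        mlflow.log_params(params)                                # hyperparameters
        mlflow.log_metric("train_accuracy", model.score(X, y))   # evaluation metric
        mlflow.sklearn.log_model(model, "model")                 # serialized model artifact

These few lines are enough for MLflow to record the run so it can later be compared, reproduced, or deployed.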
MLflow is free and open-source software (FOSS) from Databricks.
Operationalizing MLflow models with AWS SageMaker involves a series of activities: uploading runs to S3 storage, building and pushing an MLflow Docker container image, deploying a model, querying it, updating it once it is deployed, and removing the deployed model.
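Of these activities, uploading runs to S3 amounts to giving the experiment an S3 artifact location, as in the sketch below, where the bucket name is a placeholder. Building and pushing the container image is typically done with MLflow's mlflow sagemaker build-and-push-container CLI command.

    import mlflow

    # Store run artifacts in S3 instead of the local ./mlruns folder.
    # "my-mlflow-bucket" is a placeholder bucket name.
    exp_id = mlflow.create_experiment(
        "sagemaker-demo-s3",
        artifact_location="s3://my-mlflow-bucket/mlruns",
    )

    with mlflow.start_run(experiment_id=exp_id):
        mlflow.log_text("example artifact", "notes.txt")  # written under the S3 location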
Before running, MLflow needs a few AWS prerequisites in place:
1. The AWS CLI is set up and ready to use.
2. AWS IAM credentials are configured.
3. An AWS IAM role is set up for SageMaker to read/write from S3.
4. The Docker runtime is installed and ready to use.
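The first three prerequisites can be sanity-checked from Python before any deployment work begins. In this sketch, the execution role name is an assumed placeholder.

    import boto3

    # Prerequisites 1 and 2: this call fails unless CLI/IAM credentials are configured.
    identity = boto3.client("sts").get_caller_identity()
    print("Account:", identity["Account"], "| Caller ARN:", identity["Arn"])

    # Prerequisite 3: confirm the SageMaker execution role exists.
    # "sagemaker-execution-role" is a placeholder; substitute your own role name.
    role = boto3.client("iam").get_role(RoleName="sagemaker-execution-role")
    print("Execution role ARN:", role["Role"]["Arn"])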
Model deployment involves the following steps, illustrated in the sketch after this list:
· Configuring AWS: Set up an S3 bucket to hold the mlruns folder. SageMaker can refer to the mlruns in S3, using Step Functions or Lambda functions to create a Docker image and push it to ECR.
· Deploying a model to AWS SageMaker: SageMaker pulls the MLflow image from ECR and runs it on ECS; SageMaker maintains its own dedicated ECR and ECS resources.
· Making predictions: With the model now running in SageMaker, the endpoint can serve prediction requests.
· Switching models: Using MLflow, we can switch the deployed model (to the same type of model, another version, or an entirely different model).
· Removing the deployed model: Once the model has run and delivered its predictions, it can be shut down with MLflow, which deletes any endpoints that were created and helps save OpEx costs.
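The steps above map onto MLflow's SageMaker deployment client roughly as in the following sketch. It is illustrative, not definitive: the region, role ARN, bucket, model URIs, and endpoint name are all placeholders, and it assumes the MLflow pyfunc image has already been pushed to ECR (for example via mlflow sagemaker build-and-push-container).

    import pandas as pd
    from mlflow.deployments import get_deploy_client

    # Target SageMaker in a specific region (placeholder region).
    client = get_deploy_client("sagemaker:/us-east-1")

    # Deploy: creates the SageMaker model, endpoint config, and endpoint.
    client.create_deployment(
        name="demo-endpoint",
        model_uri="s3://my-mlflow-bucket/mlruns/1/<run_id>/artifacts/model",
        config={
            "execution_role_arn": "arn:aws:iam::123456789012:role/sagemaker-execution-role",
            "bucket": "my-mlflow-bucket",
            "instance_type": "ml.m5.large",
            "instance_count": 1,
        },
    )

    # Make predictions against the live endpoint.
    preds = client.predict("demo-endpoint", pd.DataFrame([[5.1, 3.5, 1.4, 0.2]]))
    print(preds)

    # Switch models: point the same endpoint at a different model version.
    client.update_deployment(
        name="demo-endpoint",
        model_uri="s3://my-mlflow-bucket/mlruns/1/<new_run_id>/artifacts/model",
    )

    # Remove the deployed model and its endpoint to stop incurring costs.
    client.delete_deployment("demo-endpoint")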
In headless mode, the model can read its input data from S3 (in CSV format, for example) and post the results back to S3 as a CSV. It can also run behind an endpoint; in that case, we can use Boto3 to send a request and receive a response.
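For the endpoint case, a Boto3 request/response round trip might look like the sketch below; the endpoint name, region, and CSV payload are placeholders.

    import boto3

    # Placeholder region and endpoint name.
    runtime = boto3.client("sagemaker-runtime", region_name="us-east-1")

    # Send one CSV-encoded row to the deployed endpoint.
    response = runtime.invoke_endpoint(
        EndpointName="demo-endpoint",
        ContentType="text/csv",
        Body="5.1,3.5,1.4,0.2\n",
    )

    # The prediction comes back in the response body.
    print(response["Body"].read().decode("utf-8"))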
A high-level conceptual architecture for MLOps on SageMaker, taken from Amazon's documentation, is depicted below for reference.
MLOps for SageMaker can be built by following the steps at https://aws.amazon.com/tutorials/machine-learning-tutorial-mlops-automate-ml-workflows/.
In conclusion, ML is, and will remain, an essential tool for businesses of all kinds. Despite the scope for investment and improvement, developing, training, and maintaining ML models has been cumbersome and ad hoc. MLOps is vital for organizations to govern and manage both traditional and generative AI models.
Other Good References:
· https://mlflow.org/docs/latest/python_api/mlflow.sagemaker.html
· https://github.com/aws-samples/sagemaker-studio-mlflow-integration