Orchestration using ECS and ECR - Part I

Intro

Orchestration means managing the container life cycle, from building images to deploying containers (which requires provisioning appropriate compute, storage, and networking resources), scaling, load balancing, and related tasks, while accounting for failures throughout.

While there are many orchestration solutions, we will focus on two: ECS by AWS and Kubernetes (both locally hosted and managed by GCP). AWS also offers Elastic Kubernetes Service (EKS), but we omit it here because the ideas are the same.

Why should data science professionals know such orchestration solutions?

  • Pro: You get key DevOps features (fault tolerance, scalability, etc.) with low ongoing effort. The deployed service is less likely to break down.
  • Con: There is a non-trivial setup cost/complexity.

Next, we will (a) push our model-serving Docker image to the AWS container registry ECR, (b) use ECS to deploy a container based on that image, and (c) set up a load balancer that mediates requests to the prediction model.

Elastic Container Registry (ECR)

  • Pushing the image to a container registry such as ECR makes it accessible from other machines and services that need to create containers from it.
  • ECR allows for better integration with the AWS platform, and works with EKS as well.
  • We will create a docker image locally (can also be done on EC2) and push it to an ECR repository that we will create.

  • Navigate to the ECR link on the AWS console.

ecr_create0

  • Click the ‘Get Started’ button to create a repository. ECR can have multiple repositories, and each repository can hold multiple images.

ecr_create1

  • Give a name to the repository. It will contain multiple Docker images.

ecr_create2

  • Review the current repository list.

ecr_create3
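
  • The console steps above can also be scripted. The following is a minimal sketch, assuming the boto3 library is installed and AWS credentials are configured; the function name create_ecr_repository is our own, and ecr_client is expected to behave like boto3.client("ecr").

```python
def create_ecr_repository(ecr_client, name):
    """Create an ECR repository and return its URI.

    `ecr_client` is expected to behave like boto3.client("ecr").
    """
    response = ecr_client.create_repository(repositoryName=name)
    # The response describes the new repository, including the URI used by docker push.
    return response["repository"]["repositoryUri"]


# Example usage (needs real credentials, so it is left commented out):
# import boto3
# client = boto3.client("ecr", region_name="us-east-2")  # region is an assumption
# print(create_ecr_repository(client, "recommendations"))
```

  • Passing the client in as an argument keeps the function easy to try out with a stand-in object before pointing it at a real account.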

  • Next, we will allow the programmatic user we created for accessing S3 (named model-user) to also manage the ECR repositories. Navigate to IAM as shown below.

iam_ecr1

  • Click on the user whose access controls we want to edit.

iam_ecr2

  • Currently this user has S3 access (not relevant for us).

iam_ecr3

  • Click on attaching existing policies and search for AmazonEC2ContainerRegistryFullAccess.

iam_ecr4

  • Review your policy choice and proceed.

iam_ecr5

  • You can see the summary of permissions that this programmatic user has.

iam_ecr6
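
  • The same permission change can be sketched in code. This assumes boto3 and credentials that are allowed to modify IAM users; the helper names grant_ecr_access and has_ecr_access are ours, and iam_client is expected to behave like boto3.client("iam").

```python
# ARN of the AWS-managed policy selected in the console steps above.
ECR_FULL_ACCESS_ARN = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess"


def grant_ecr_access(iam_client, user_name):
    """Attach the managed ECR full-access policy to an IAM user."""
    iam_client.attach_user_policy(UserName=user_name, PolicyArn=ECR_FULL_ACCESS_ARN)


def has_ecr_access(iam_client, user_name):
    """Check whether the policy is among the user's attached managed policies.

    Only the first page of results is inspected; pagination is ignored in this sketch.
    """
    response = iam_client.list_attached_user_policies(UserName=user_name)
    return any(p["PolicyArn"] == ECR_FULL_ACCESS_ARN for p in response["AttachedPolicies"])


# Example usage (needs real credentials, so it is left commented out):
# import boto3
# iam = boto3.client("iam")
# grant_ecr_access(iam, "model-user")
# print(has_ecr_access(iam, "model-user"))
```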

Creating the Model Prediction Docker Image

  • We will use the following Flask app (recommend.py), which uses the PyTorch model to serve recommendations:

    from surprise import Dataset
    import torch
    import torch.nn.functional as F
    import pandas as pd
    import flask
    
    
    def l2_regularize(array):
        # Assumed helper (its definition is not shown in these notes): sum of squares
        return torch.sum(array ** 2.0)
    
    
    class MF(torch.nn.Module):
        
        def __init__(self, n_user, n_item, k=18, c_vector=1.0, c_bias=1.0):
            super(MF, self).__init__()
            self.k = k
            self.n_user = n_user
            self.n_item = n_item
            self.c_bias = c_bias
            self.c_vector = c_vector
            
            # Latent vectors for users and items
            self.user = torch.nn.Embedding(n_user, k)
            self.item = torch.nn.Embedding(n_item, k)
            
            # We've added new terms here:
            self.bias_user = torch.nn.Embedding(n_user, 1)
            self.bias_item = torch.nn.Embedding(n_item, 1)
            self.bias = torch.nn.Parameter(torch.ones(1))
        
        def __call__(self, train_x):
            # train_x has one row per (user index, item index) pair
            user_id = train_x[:, 0]
            item_id = train_x[:, 1]
            vector_user = self.user(user_id)
            vector_item = self.item(item_id)
            
            # Pull out biases
            bias_user = self.bias_user(user_id).squeeze()
            bias_item = self.bias_item(item_id).squeeze()
            biases = (self.bias + bias_user + bias_item)
            
            ui_interaction = torch.sum(vector_user * vector_item, dim=1)
            
            # Add bias prediction to the interaction prediction
            prediction = ui_interaction + biases
            return prediction
        
        def loss(self, prediction, target):
            # Only needed for training; kept so the class matches the trained model
            loss_mse = F.mse_loss(prediction, target.squeeze())
            
            # Add new regularization to the biases
            prior_bias_user = l2_regularize(self.bias_user.weight) * self.c_bias
            prior_bias_item = l2_regularize(self.bias_item.weight) * self.c_bias
            
            prior_user = l2_regularize(self.user.weight) * self.c_vector
            prior_item = l2_regularize(self.item.weight) * self.c_vector
            total = loss_mse + prior_user + prior_item + prior_bias_user + prior_bias_item
            return total
    
    def get_top_n(model, testset, trainset, uid_input, n=10):
        # Returns up to n tuples (item id, movie name, predicted rating) for a raw user id.
        # Relies on the module-level DataFrame df (defined below) for movie names.
        preds = []
        try:
            uid_input = int(trainset.to_inner_uid(uid_input))
        except KeyError:
            return preds  # unknown user
        
        # Collect predictions for the requested user only.
        for uid, iid, _ in testset:  # inefficient: scans every (user, item) pair
            try:
                uid_internal = int(trainset.to_inner_uid(uid))
            except KeyError:
                continue
            if uid_internal == uid_input:
                try:
                    iid_internal = int(trainset.to_inner_iid(iid))
                    movie_name = df.loc[int(iid), 'name']
                    score = float(model(torch.tensor([[uid_input, iid_internal]])))
                    preds.append((iid, movie_name, score))
                except KeyError:
                    pass
        # Sort by predicted rating (the third tuple element) and keep the n highest.
        preds.sort(key=lambda x: x[2], reverse=True)
        return preds[:n]
    
    
    app = flask.Flask(__name__)
    
    
    #Data
    df = pd.read_csv('./movies.dat',sep="::",header=None,engine='python')
    df.columns = ['iid','name','genre']
    df.set_index('iid',inplace=True)
    data = Dataset.load_builtin('ml-100k',prompt=False) 
    '''
    Exercise: remove the above dependency. 
    Currently it downloads data from grouplens website and stores in .surprise folder in $HOME
    '''
    trainset = data.build_full_trainset()
    testset = trainset.build_anti_testset()
    
    #Parameters that are needed to reload the model from disk
    k = 10 #latent dimension
    c_bias = 1e-6
    c_vector = 1e-6
    model = MF(trainset.n_users, trainset.n_items, k=k, c_bias=c_bias, c_vector=c_vector)
    model.load_state_dict(torch.load('./pytorch_model'))
    model.eval() #no need for gradient computations in this setting
    
    
    # define a predict function as an endpoint
    @app.route("/", methods=["GET"])
    def predict():
        data = {"success": False}
        
        # Check for passed-in parameters: JSON body first, then the query string.
        # get_json(silent=True) returns None instead of raising when there is no JSON body.
        params = flask.request.get_json(silent=True)
        if params is None:
            params = flask.request.args
        
        if "uid" in params:
            data["response"] = get_top_n(model, testset, trainset, params['uid'], n=10)
            data["success"] = True
        
        # Return a response in JSON format
        return flask.jsonify(data)
    
    
    # start the flask app, allow remote connections
    app.run(host='0.0.0.0', port=80)
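
  • The final step of get_top_n is meant to keep the n items with the highest predicted rating. In isolation, and assuming tuples of the form (item id, movie name, predicted rating) as built above, that step can be sketched as follows. The function name top_n_by_score and the example values are made up for illustration.

```python
import heapq


def top_n_by_score(preds, n=10):
    """Return the n tuples with the highest predicted rating, best first.

    Each element of `preds` is (item_id, movie_name, predicted_rating).
    """
    return heapq.nlargest(n, preds, key=lambda p: p[2])


# Made-up predictions, for illustration only.
example = [
    ("1", "Toy Story (1995)", 3.9),
    ("2", "GoldenEye (1995)", 3.1),
    ("3", "Four Rooms (1995)", 4.4),
]
print(top_n_by_score(example, n=2))
```

  • heapq.nlargest avoids sorting the full list when n is much smaller than the number of candidate items; for short lists a plain sort works just as well.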
    
  • The corresponding Dockerfile is below. Besides recommend.py above, it needs the following files:

    • movies.dat
    • pytorch_model

      FROM continuumio/miniconda3:latest
      
      RUN conda install -y flask pandas \
      && conda install -y -c conda-forge scikit-surprise \
      && conda install -y pytorch torchvision cpuonly -c pytorch
      
      COPY recommend.py recommend.py
      COPY movies.dat movies.dat
      COPY pytorch_model pytorch_model
      
      ENTRYPOINT ["python","recommend.py"]
      
  • The miniconda image above is from https://hub.docker.com/r/continuumio/miniconda3.

  • Building an image based on the above file and running our prediction locally can be done using the following commands:

    docker image build -t "prediction_service" .
    docker run -d -p 5000:80 prediction_service #map host port 5000 to container port 80, where the Flask app listens
    docker ps -a #list all containers, running or stopped
    docker kill container_id #after checking that the service runs, we can safely stop the container
    docker rm container_id #delete the stopped container
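
  • While the container is running, the endpoint can be queried from Python. This is a minimal sketch using only the standard library; it assumes the service is reachable on localhost at host port 5000, and the function names are our own.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(uid, host="localhost", port=5000):
    """Build the GET URL for the prediction endpoint; host and port are assumptions."""
    query = urllib.parse.urlencode({"uid": uid})
    return f"http://{host}:{port}/?{query}"


def fetch_recommendations(uid, host="localhost", port=5000, timeout=30):
    """Call the running service and return the decoded JSON response."""
    url = build_query_url(uid, host, port)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


print(build_query_url("196"))
```

  • Calling fetch_recommendations with a user id that exists in the dataset should return a dictionary with the keys success and response, matching what the predict function builds.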
    
  • If we run a container based on this image, recommend.py and the other files will be in the root (/) folder and the service will run as the root user. While we will not change this here, it is better to run services as non-root users.

Sending our Docker Image to ECR

  • We will follow the AWS instructions for pushing a Docker image to ECR, using the repository we just created.

  • Assuming you have the aws CLI configured with the secret keys, run the following command:

    aws ecr get-login-password --region region | docker login --username AWS --password-stdin aws_account_id.dkr.ecr.region.amazonaws.com
    
  • Substitute region with your region (for example us-east-1; check the URL on the ECR page) and aws_account_id with your actual account ID. You should see the message ‘Login Succeeded’.
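
  • To our understanding, the login command above obtains a temporary authorization token from ECR and passes its password part to docker login. The sketch below shows how such a token can be unpacked, assuming boto3 for the commented usage; the function names are our own, and ecr_client is expected to behave like boto3.client("ecr").

```python
import base64


def decode_ecr_token(authorization_token):
    """Split a base64-encoded ECR authorization token into (username, password)."""
    decoded = base64.b64decode(authorization_token).decode("utf-8")
    username, password = decoded.split(":", 1)
    return username, password


def docker_credentials(ecr_client):
    """Return (username, password, registry URL) suitable for docker login."""
    auth = ecr_client.get_authorization_token()["authorizationData"][0]
    username, password = decode_ecr_token(auth["authorizationToken"])
    return username, password, auth["proxyEndpoint"]


# Example usage (needs real credentials, so it is left commented out):
# import boto3
# user, password, registry = docker_credentials(boto3.client("ecr", region_name="us-east-2"))
```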

  • Let's tag our image before sending it to ECR (replace account id and region below as well):

    (datasci-dev) ttmac:docker-prediction-service theja$ docker tag prediction_service aws_account_id.dkr.ecr.region.amazonaws.com/recommendations:pytorch_model
    (datasci-dev) ttmac:docker-prediction-service theja$ docker images
    REPOSITORY                                                     TAG                 IMAGE ID            CREATED             SIZE
    aws_account_id.dkr.ecr.region.amazonaws.com/recommendations    pytorch_model       364179b27eb1        21 minutes ago      2.06GB
    weather_service                                                latest              20d340f941c0        2 days ago          496MB
    debian                                                         buster-slim         c7346dd7f20e        5 weeks ago         69.2MB
    continuumio/miniconda3                                         latest              b4adc22212f1        6 months ago        429MB
    hello-world                                                    latest              bf756fb1ae65        8 months ago        13.3kB
    
  • Pushing to ECR is achieved by the following:

    docker push aws_account_id.dkr.ecr.region.amazonaws.com/recommendations:pytorch_model
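
  • The image name used for tagging and pushing follows a fixed pattern: account id, region, repository, and tag. A small helper (our own, not part of any AWS tool) makes the pattern explicit and avoids typos in the registry host name. The account id below is a placeholder.

```python
def ecr_image_uri(account_id, region, repository, tag):
    """Compose the full image name that docker tag and docker push expect for ECR."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    return f"{registry}/{repository}:{tag}"


# Placeholder account id, for illustration only.
print(ecr_image_uri("123456789012", "us-east-2", "recommendations", "pytorch_model"))
```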
    
  • You should see the upload progress (this is a large upload!):

    The push refers to repository [aws_account_id.dkr.ecr.us-east-2.amazonaws.com/recommendations]
    a5649bbe3e5f: Pushed
    5c87fc4d582f: Pushed
    e1e8d92205bf: Pushed
    5c6c81390816: Pushing [=========================>                         ]  848.8MB/1.635GB
    fcd8d39597dd: Pushed
    875120aa853c: Pushed
    f2cb0ecef392: Pushed
    
  • Once the push completes, the output ends with the image digest:

    (datasci-dev) ttmac:docker-prediction-service theja$ docker push aws_account_id.dkr.ecr.us-east-2.amazonaws.com/recommendations:pytorch_model
    The push refers to repository [aws_account_id.dkr.ecr.us-east-2.amazonaws.com/recommendations]
    a5649bbe3e5f: Pushed
    5c87fc4d582f: Pushed
    e1e8d92205bf: Pushed
    5c6c81390816: Pushed
    fcd8d39597dd: Pushed
    875120aa853c: Pushed
    f2cb0ecef392: Pushed
    pytorch_model: digest: sha256:af5dfaf227cd96c4ca8ca952c511fb4274c59d76574726462137bc7c4230be07 size: 1793
    
  • On the ECR page, the recommendations repository now lists our recently uploaded image.

ecr_image1
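
  • The upload can also be checked programmatically. The sketch below assumes boto3 for the commented usage; the function name list_image_tags is our own, and ecr_client is expected to behave like boto3.client("ecr").

```python
def list_image_tags(ecr_client, repository):
    """Return {image digest: [tags]} for the images in an ECR repository.

    Only the first page of results is read; pagination is ignored in this sketch.
    """
    response = ecr_client.describe_images(repositoryName=repository)
    # Untagged images have no "imageTags" key, hence the default.
    return {d["imageDigest"]: d.get("imageTags", []) for d in response["imageDetails"]}


# Example usage (needs real credentials, so it is left commented out):
# import boto3
# print(list_image_tags(boto3.client("ecr", region_name="us-east-2"), "recommendations"))
```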

  • The same page also has a friendly help box with the push commands filled in for your account. You can use it, or the help page, as a guiding reference.

ecr_push