Using Fargate will hide a lot of complexity, especially around provisioning the underlying EC2 instances.
Let's start by heading over to ECS.
Navigating to http://18.220.91.58/?uid=20 returns the model's prediction response.

A Load Balancer will give us a static URL and route incoming HTTP(S) requests to ECS tasks managed by an ECS service.
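As a quick sanity check, here is a minimal Python sketch for querying the endpoint. It uses the direct IP from the example above; once the load balancer is in place, you would swap in its DNS name instead.

```python
import requests

# Query the model-serving endpoint directly (IP from the example above).
# After the ALB is set up, replace the host with the load balancer's DNS name.
response = requests.get("http://18.220.91.58/", params={"uid": 20})
print(response.status_code)
print(response.text)  # the model's prediction for this uid
```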
There are several types of load balancers on AWS (see the AWS documentation for a comparison). We will use an Application Load Balancer (ALB).
Using the ‘Get Started’ workflow on ECS is the easiest way to set this up to work with the cluster.
The load balancer uses a VPC (virtual private cloud, an AWS service), a security group (which controls who can access our container) and a target group (which determines where requests are routed).
Aside: VPC essentially isolates your computing environment from the external world.
Amazon Virtual Private Cloud (Amazon VPC) lets you provision a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network that you define. You have complete control over your virtual networking environment, including selection of your own IP address range, creation of subnets, and configuration of route tables and network gateways. You can use both IPv4 and IPv6 in your VPC for secure and easy access to resources and applications.
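If you would rather script these pieces than click through the console, a rough boto3 sketch of the three components wired together might look like the following. All names, IDs and the region below are placeholders, not values from this walkthrough.

```python
import boto3

elbv2 = boto3.client("elbv2", region_name="us-east-2")

# Target group: where the ALB sends requests. TargetType "ip" is what
# Fargate tasks need, since each task gets its own network interface.
tg = elbv2.create_target_group(
    Name="model-targets",
    Protocol="HTTP",
    Port=80,
    VpcId="vpc-0123456789abcdef0",
    TargetType="ip",
    HealthCheckPath="/",
)["TargetGroups"][0]

# The ALB itself, placed in (at least two) subnets of the VPC and
# guarded by a security group that allows inbound HTTP.
alb = elbv2.create_load_balancer(
    Name="model-alb",
    Subnets=["subnet-aaaa1111", "subnet-bbbb2222"],
    SecurityGroups=["sg-0123456789abcdef0"],
    Scheme="internet-facing",
    Type="application",
)["LoadBalancers"][0]

# Listener: forward incoming HTTP requests on port 80 to the target group.
elbv2.create_listener(
    LoadBalancerArn=alb["LoadBalancerArn"],
    Protocol="HTTP",
    Port=80,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": tg["TargetGroupArn"]}],
)

print(alb["DNSName"])  # the static URL clients will use
```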
We can also add the load balancer separately while creating the service (we will skip the details here).
You can access the public static URL by navigating to the load balancer page.
First click on the load balancer on the service page shown above.
Then click on the load balancer link to the top right of the page.
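Alternatively, the DNS name can be fetched programmatically; here is a small boto3 sketch (the load balancer name is a placeholder for your own):

```python
import boto3

# Look up the ALB's public DNS name instead of clicking through the console.
elbv2 = boto3.client("elbv2", region_name="us-east-2")
lbs = elbv2.describe_load_balancers(Names=["model-alb"])
print(lbs["LoadBalancers"][0]["DNSName"])
```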
Since there are a lot of steps involved (as well as quite a few moving parts), it's good to revisit our original goal.
Our goal was to set up model prediction/deployment in such a way that it scales and is resilient to failures.
The ECS cluster is scalable (we can add more tasks and services easily).
Further, the ECS service manages these tasks so that even if the underlying EC2 instances running the containers fail (for any reason), the tasks can be restarted on other machines to keep everything running smoothly.
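Scaling out is then a matter of raising the service's desired task count, either from the console or, as a rough sketch, via boto3 (cluster and service names below are placeholders):

```python
import boto3

# Scale the service out to four parallel model-serving tasks.
ecs = boto3.client("ecs", region_name="us-east-2")
ecs.update_service(
    cluster="model-cluster",
    service="model-service",
    desiredCount=4,
)
```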
Finally, the load balancer maps a static external URL to the internal container(s). So if there are multiple model prediction containers, the load balancer will use an algorithm (such as round robin) to distribute the incoming requests.
While this takes a lot more work to set up than the serverless solution, you get more fine-grained control over, and visibility into, the components supporting your scalable model deployment.