How to Perform a Rolling Deployment on AWS
We've gone through a few iterations of our rolling deployment strategy for the cloudtamer.io application, starting with Kubernetes and Kubernetes Operations (kops). We recently switched to a cloud-native approach built on CloudFormation templates, which allows us to deploy our software reliably in all AWS regions and in most VPCs.
In this article, I'll give you a quick tutorial on how to launch an application on AWS using CloudFormation templates and then perform updates to it using rolling deployments to prevent any downtime of your application. I've included a link to a working CloudFormation template at the end of this article that you can use as a reference.
What's Created by the CloudFormation Template
The CloudFormation template creates the following:
- User-facing elastic load balancer (ELB) to load balance traffic across the nodes and allow you to change the nodes behind the load balancer without changing the address your users use to access the application
- User-facing security group to give users access to the application based on their IP address
- Auto-scaling group to create and destroy nodes based on the scaling policy
- Launch configuration to define the characteristics of the nodes created by the auto-scaling group
- Internal security group to provide the load balancer with access to the nodes for their health checks
- IAM role with an instance profile to provide the nodes with permission to access S3 (just for demonstration purposes)
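As a rough map from the list above to CloudFormation resource types, here is a minimal Python sketch that prints a JSON template skeleton. The logical IDs are hypothetical, the `Properties` sections are omitted (so the output is not deployable as-is), and the classic load balancer type is an assumption based on the "ELB" wording; the linked template may be organized differently.

```python
import json

# Logical IDs (the keys) are hypothetical names chosen for illustration.
# The values are standard CloudFormation resource types.
RESOURCES = {
    "UserFacingLoadBalancer": "AWS::ElasticLoadBalancing::LoadBalancer",  # assumes a classic ELB
    "UserFacingSecurityGroup": "AWS::EC2::SecurityGroup",
    "NodeAutoScalingGroup": "AWS::AutoScaling::AutoScalingGroup",
    "NodeLaunchConfiguration": "AWS::AutoScaling::LaunchConfiguration",
    "InternalSecurityGroup": "AWS::EC2::SecurityGroup",
    "NodeRole": "AWS::IAM::Role",
    "NodeInstanceProfile": "AWS::IAM::InstanceProfile",
}


def template_skeleton() -> dict:
    """Return a template outline with one entry per resource (Properties omitted)."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {name: {"Type": rtype} for name, rtype in RESOURCES.items()},
    }


print(json.dumps(template_skeleton(), indent=2))
```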
How Does Rolling Deployment Work?
When the CloudFormation stack is first launched, it performs the following steps:
- Creates the user-facing security group.
- Creates the IAM role with an instance profile.
- Creates the internal security group.
- Creates the load balancer.
- Creates the launch configuration.
- Creates the auto-scaling group.
When the auto-scaling group is created, it launches the number of nodes specified by the user-supplied parameters in the CloudFormation template. The launch configuration holds the user data (a bash script) that is passed to each node on boot. The script creates an `app` folder in the `/opt` directory, changes to that folder, updates the `$PATH` environment variable to include the AWS tools, runs the `PostCommand` parameter defined in the CloudFormation template, and then calls the `cfn-signal` app. On first creation, the `cfn-signal` call is not required for the installation to finish, but it is required during an update. Once the node passes the health checks, it becomes "InService" on the load balancer. You can then open the URL of the load balancer in your web browser to see the "hello world v1.0" message.
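To make the boot sequence concrete, here is a minimal Python sketch that builds a `UserData` property along the lines described above. It is not the script from the linked template: the `/opt/aws/bin` tool location, the `NodeAutoScalingGroup` logical ID, and the use of `Fn::Sub` to inject `PostCommand` are all assumptions for illustration.

```python
import json

# Bash script passed to each node on boot. Fn::Sub replaces ${PostCommand},
# ${AWS::StackName}, and ${AWS::Region}; $PATH and $? have no braces, so
# CloudFormation leaves them for bash to evaluate.
USER_DATA_SCRIPT = """#!/bin/bash
# Create the application directory and work from it.
mkdir -p /opt/app
cd /opt/app
# Put the AWS helper tools on the PATH (assumed location: /opt/aws/bin).
export PATH="$PATH:/opt/aws/bin"
# Run the command supplied through the PostCommand parameter.
${PostCommand}
# Report the exit status of the previous command back to CloudFormation.
cfn-signal -e $? --stack ${AWS::StackName} --resource NodeAutoScalingGroup --region ${AWS::Region}
"""


def user_data_property(script: str = USER_DATA_SCRIPT) -> dict:
    """Wrap the script so CloudFormation substitutes variables and base64-encodes it."""
    return {"UserData": {"Fn::Base64": {"Fn::Sub": script}}}


print(json.dumps(user_data_property(), indent=2))
```

Passing `$?` to `cfn-signal -e` means a failing `PostCommand` is reported as a failure rather than a success, which is what lets an update stop before any healthy node is replaced.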
To update the app, start an UpdateStack operation on the CloudFormation stack, change the text in `PostCommand` from "v1.0" to "v2.0", and then submit the update. CloudFormation then performs the following steps:
- Updates the launch configuration with the new user data from the `PostCommand` parameter.
- Increases the auto-scaling group desired capacity by 1 so it creates an additional node.
- Waits for the node to use the `cfn-signal` app to send a "success" message.
- At this point, the node should be "InService" because it will have passed the health checks.
- Once the CloudFormation service receives the success message, it decreases the desired capacity by 1, which terminates the old node.
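The steps above can be traced with a small, self-contained Python simulation. It is only a model of the sequence described here (add one node, wait for its signal, then remove one old node); the `Node` class and `rolling_update` function are hypothetical and do not call AWS.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    version: str
    in_service: bool = False


def rolling_update(nodes, new_version, signal_ok=lambda node: True):
    """Replace old nodes one at a time, adding a new node before removing an old one.

    Returns the updated node list and the lowest in-service count seen.
    """
    nodes = list(nodes)
    lowest = sum(n.in_service for n in nodes)
    old_nodes = [n for n in nodes if n.version != new_version]
    for i, old in enumerate(old_nodes, start=1):
        # Desired capacity +1: launch a node from the updated launch configuration.
        new = Node(f"new-{i}", new_version)
        nodes.append(new)
        # Wait for the signal; on failure, stop and leave the old nodes untouched.
        if not signal_ok(new):
            nodes.remove(new)
            raise RuntimeError(f"{new.name} did not signal success")
        new.in_service = True
        # Desired capacity -1: remove the old node only after the new one is ready.
        nodes.remove(old)
        lowest = min(lowest, sum(n.in_service for n in nodes))
    return nodes, lowest


fleet = [Node("old-1", "v1.0", True), Node("old-2", "v1.0", True)]
updated, lowest_in_service = rolling_update(fleet, "v2.0")
print([n.version for n in updated], lowest_in_service)
```

In this model the in-service count never drops below its starting value, because each old node is removed only after its replacement has signaled success.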
The rolling deployment keeps at least one EC2 instance "InService" and waits for the new node to inform the CloudFormation service that it is set up before making any changes to existing nodes.
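One common way to get this behavior is an `UpdatePolicy` attribute with `AutoScalingRollingUpdate` on the auto-scaling group. The Python sketch below prints such a fragment as JSON. Whether the linked template uses exactly these settings is an assumption, and the 15-minute timeout is an illustrative value.

```python
import json


def rolling_update_policy(min_in_service: int = 1,
                          batch_size: int = 1,
                          timeout: str = "PT15M") -> dict:
    """Build an UpdatePolicy attribute for an AWS::AutoScaling::AutoScalingGroup."""
    return {
        "UpdatePolicy": {
            "AutoScalingRollingUpdate": {
                # Keep at least this many instances serving traffic during the update.
                "MinInstancesInService": min_in_service,
                # Replace this many instances at a time.
                "MaxBatchSize": batch_size,
                # Wait for cfn-signal from new instances before continuing.
                "WaitOnResourceSignals": True,
                # How long to wait for the signal (ISO 8601 duration); assumed value.
                "PauseTime": timeout,
            }
        }
    }


print(json.dumps(rolling_update_policy(), indent=2))
```

With `WaitOnResourceSignals` enabled, a node that never calls `cfn-signal` within `PauseTime` causes the update to fail instead of silently replacing healthy nodes.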
We've deployed our application in many different environments. Here are some recommendations based on what we've learned:
- Support various VPC configurations: Organizations usually already have a network architecture tailored to their needs, so your application must allow the user to select the VPC and subnets during installation.
- Support custom AMIs: Organizations may require a custom AMI so you may need to support a custom block device mapping if you are changing the size of the boot device. This is why we've allowed the customization of the `Image Root Device` parameter in the CloudFormation template. You may still have to run a few commands to expand the root volume in the `PostCommand`. Here is an example that expands a root volume if using a custom AMI:
`growpart /dev/nvme0n1 2; pvresize /dev/nvme0n1p2; lvresize -r -L +95GB /dev/mapper/VolGroup00-varVol`
- Support SSL out of the box: We've provided an optional `SSL Certificate` CloudFormation parameter that attaches an SSL certificate from ACM to the load balancer. If you leave it blank, the application will only respond on port 80 instead of port 443.
- Support customer security groups: Most organizations perform scans of their AMIs, so allow them to specify their own security groups to attach to the nodes, giving their security team the access it needs.
- Support different load balancer types: The CloudFormation template allows you to specify whether the user-facing load balancer is internal or internet-facing. Many of our customers have VPCs without an internet gateway, so you'll need to support an internal load balancer for this scenario, which is easy to configure.
- Clearly explain every parameter: CloudFormation allows you to set a description for each parameter. This helped us clearly articulate the purpose of each field and how to use it. You should add notes around using your application with configurations customers might have, such as dedicated tenancy VPCs.
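Several of these recommendations come down to parameters and conditions in the template. The Python sketch below prints a JSON fragment with a described load balancer scheme parameter, an optional certificate parameter, and a condition that switches the listener between port 443 and port 80. The parameter names, defaults, descriptions, and instance port are hypothetical, and a classic load balancer is assumed.

```python
import json

fragment = {
    "Parameters": {
        "LoadBalancerScheme": {
            "Type": "String",
            "AllowedValues": ["internal", "internet-facing"],
            "Default": "internal",
            "Description": "Use 'internal' for VPCs without an internet gateway.",
        },
        "SSLCertificate": {
            "Type": "String",
            "Default": "",
            "Description": "Optional ACM certificate ARN. Leave blank to serve HTTP on port 80.",
        },
    },
    "Conditions": {
        # True only when a certificate ARN was supplied.
        "HasSSLCertificate": {
            "Fn::Not": [{"Fn::Equals": [{"Ref": "SSLCertificate"}, ""]}]
        },
    },
}

# Listener list chosen at deploy time: HTTPS on 443 with a certificate, else HTTP on 80.
listeners = {
    "Fn::If": [
        "HasSSLCertificate",
        [{
            "LoadBalancerPort": "443",
            "Protocol": "HTTPS",
            "InstancePort": "80",        # assumed application port
            "InstanceProtocol": "HTTP",  # TLS terminates at the load balancer
            "SSLCertificateId": {"Ref": "SSLCertificate"},
        }],
        [{"LoadBalancerPort": "80", "Protocol": "HTTP", "InstancePort": "80"}],
    ]
}

load_balancer_properties = {
    "Scheme": {"Ref": "LoadBalancerScheme"},
    "Listeners": listeners,
}

print(json.dumps({**fragment, "LoadBalancerProperties": load_balancer_properties}, indent=2))
```

Restricting the scheme with `AllowedValues` and writing a `Description` for each parameter means the CloudFormation console presents a drop-down and inline help, which reduces the chance of an invalid value reaching the stack.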
Reference: AWS Rolling Deployment template
Joe leads engineering at cloudtamer.io.