Blue/Green Worker Node Deployment with Kubernetes, EKS and Terraform

Here at Lumo, we use Kubernetes for our API, its supporting services, and the auxiliary cronjobs in our data pipeline. Before EKS was a thing, we deployed our clusters using kubespray and Terraform. While that combination got the job done, the code became sprawling, and changes to the nodes or control plane were not easy to make and took forever to apply. So we decided to check out EKS and use blue/green deployments to manage changes to our nodes. See this repo for some code examples referenced throughout this post.

Enter EKS – AWS's answer to GKE and AKS. I have to say that I still think EKS has a lot of issues to iron out and features to support, including allowing a Kubernetes version other than 1.10.3! All that aside, our environments have not experienced any downtime related to a flaky control plane (:crossesfingers:) – API latency is slightly higher than with our kubespray deployments, but that's to be expected given how API traffic is now routed. EKS (with the help of Heptio Ark) has also allowed us to cut our Kubernetes cluster deployments from around 1.5 hours to under 20 minutes (it takes about 10 minutes for an EKS cluster to initialize) – and that includes restoring all resources with Ark and having full availability. Most of this is done using the terraform-aws-eks module.

So now that we have an EKS cluster and a group of workers attached to an autoscaling group, another question comes into play – how should we patch them? We could run Ansible on a schedule, run a patching playbook manually, or build a new AMI with baked-in updates and release it every few weeks. In the end, the AMI made the most sense because workers would be able to scale in and out of the cluster without needing any additional provisioning. However, introducing this into the flow of things meant that we needed to find a zero-downtime way of updating our workers.

Fortunately, the terraform-aws-eks module makes this pretty simple as long as some other critical steps are sprinkled in. Here’s an outline of what needs to be done to transition to a worker group with an updated AMI:

  1. Create new worker group with updated AMI
  2. Wait for the new workers to join the cluster – it takes about 30s to build them and another 30s or so for them to become Ready
  3. Assuming your Load Balancers are already aware of the autoscaling groups created by the terraform-aws-eks module, make sure the new workers are attached to your LBs before proceeding, or you will be in for a rude awakening when you transition pods in the next step!
  4. Drain the old nodes to transition pods slowly over to the new nodes – do one node at a time!
  5. After verifying all the pods have been moved to the right nodes, scale the old worker autoscaling group to zero

Creating a new worker group

Creating an additional worker group is very simple with the terraform-aws-eks module. Just add another map to your worker_groups list variable telling the module what attributes it should use to create the new workers. In this case, the only thing you need to update is the AMI (unless you want other attributes to change – like the instance size, etc).

For a blue/green configuration, our code ends up looking something like the worker_groups example at the end of this post – that is what your terraform will look like AFTER an old worker group is decommissioned. To use this deployment method for new workers, you will always have two maps in the worker_groups list, so you will always see two autoscaling groups in EC2 (although only one will be scaled up outside of a rollout).
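
To make the intermediate state concrete, here is a sketch of what the two maps might look like mid-rollout, after the green group has been added but before blue is scaled down in the final step. The AMI IDs and key pair are the same placeholder values used in the example at the end of the post:

    map(
      "name", "k8s-worker-blue",
      "ami_id", "ami-179fc16f",
      "asg_desired_capacity", "5",
      "asg_max_size", "8",
      "asg_min_size", "5",
      "instance_type", "m4.xlarge",
      "key_name", "${aws_key_pair.infra-deployer.key_name}",
      "root_volume_size", "48"
    ),
    map(
      "name", "k8s-worker-green",
      "ami_id", "ami-67a0841f",
      "asg_desired_capacity", "5",
      "asg_max_size", "8",
      "asg_min_size", "5",
      "instance_type", "m4.xlarge",
      "key_name", "${aws_key_pair.infra-deployer.key_name}",
      "root_volume_size", "48"
    )

Blue is still running the old AMI and serving traffic; green has the new AMI at the same capacity, so the cluster temporarily runs with double the usual worker count.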

Scaling up your new workers

After you've modified your terraform to deploy two worker groups and applied the changes, you should now see two autoscaling groups in EC2 that are both scaled to your desired_capacity. Use kubectl get no to make sure all of the new workers are recognized by the cluster and reporting Ready.
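
A quick sanity check before moving on – this is just a minimal sketch, assuming nothing else in the cluster is cordoned or NotReady:

    # Count Ready nodes; it should equal the old plus the new desired_capacity
    # (10 in the node listing below: 5 blue + 5 green).
    kubectl get no --no-headers | grep -c -w Ready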

Check your load balancers!

Depending on how your load balancers are managed, the new workers may or may not be attached to them at this point, so that is the first thing to double-check during the first few runs of this method – before you rely on it in production. We use terraform to manage our load balancers (not Kubernetes), so at this point we run terraform apply again to attach the new autoscaling group to the appropriate load balancers. All of our services use the NodePort type, for what it's worth.

At this point you will have both worker groups in your load balancers. Once the new nodes are reporting healthy, we are ready to move on to pod eviction.
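
Our load balancer code isn't shown in this post, but the attachment boils down to something like the sketch below. The module instance name (eks), the aws_elb.api resource, and the workers_asg_names output are assumptions – the output name depends on the version of the terraform-aws-eks module you're running, and your load balancers will be your own:

    # Hypothetical wiring: attach each worker-group ASG (blue and green) to an
    # existing classic ELB so NodePort traffic reaches whichever group is scaled up.
    resource "aws_autoscaling_attachment" "api_workers" {
      count                  = 2    # one attachment per worker group
      autoscaling_group_name = "${element(module.eks.workers_asg_names, count.index)}"
      elb                    = "${aws_elb.api.name}"
    }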

Drain the old nodes to transition pods, slowly!

Now that all the new workers are up and healthy, we can start draining the old nodes. We can get a list of all nodes sorted by creation date using kubectl get no --sort-by=.metadata.creationTimestamp. Without getting into node labels, we can use the AGE of the nodes as the filter for the drain one-liner.

NAME                                         STATUS         ROLES     AGE       VERSION
ip-10-99-60-148.us-west-2.compute.internal   Ready                    22h       v1.10.3
ip-10-99-61-156.us-west-2.compute.internal   Ready                    22h       v1.10.3
ip-10-99-61-237.us-west-2.compute.internal   Ready                    22h       v1.10.3
ip-10-99-62-114.us-west-2.compute.internal   Ready                    22h       v1.10.3
ip-10-99-62-175.us-west-2.compute.internal   Ready                    22h       v1.10.3
ip-10-99-60-224.us-west-2.compute.internal   Ready                    11m       v1.10.3
ip-10-99-60-7.us-west-2.compute.internal     Ready                    11m       v1.10.3
ip-10-99-61-132.us-west-2.compute.internal   Ready                    11m       v1.10.3
ip-10-99-61-5.us-west-2.compute.internal     Ready                    11m       v1.10.3
ip-10-99-62-13.us-west-2.compute.internal    Ready                    11m       v1.10.3

But before you run a script to drain all of the old nodes, drain just one and make sure the evicted pods come up OK on the new nodes. After you've verified that the pods function well on the new node(s), we run this script to drain the rest of the old nodes.
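
That script isn't reproduced here, but a minimal sketch of a drain loop along these lines might look like the following. It assumes, as in the listing above, that the five oldest nodes are the old worker group; the node count, drain flags, and sleep interval are placeholders to adjust for your cluster and workloads:

    #!/usr/bin/env bash
    # Drain the 5 oldest nodes (the old worker group) one at a time, pausing
    # between nodes so evicted pods can reschedule onto the new workers.
    set -euo pipefail

    OLD_NODES=$(kubectl get no --sort-by=.metadata.creationTimestamp --no-headers \
      | awk 'NR<=5 {print $1}')

    for node in $OLD_NODES; do
      kubectl drain "$node" --ignore-daemonsets --delete-local-data
      sleep 120   # give pods time to settle before draining the next node
    done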

Scale down the old autoscaling group

All your pods should now be transitioned over to the new workers! As long as nothing went terribly wrong, you’re ready to scale down the old autoscaling group. Your worker_groups var should end up looking something like this:

    map(
      "name", "k8s-worker-blue",
      "ami_id", "ami-179fc16f",
      "asg_desired_capacity", "0",
      "asg_max_size", "0",
      "asg_min_size", "0",
      "instance_type","m4.xlarge",
      "key_name", "${aws_key_pair.infra-deployer.key_name}",
      "root_volume_size", "48"
    ),
    map(
      "name", "k8s-worker-green",
      "ami_id", "ami-67a0841f",
      "asg_desired_capacity", "5",
      "asg_max_size", "8",
      "asg_min_size", "5",
      "instance_type","m4.xlarge",
      "key_name", "${aws_key_pair.infra-deployer.key_name}",
      "root_volume_size", "48"
    )

The blue workers have been scaled to zero and green is now active. The next time you roll out a new AMI, blue will be scaled up with it and green will be scaled down to zero. Credit to @brandoconnor on GitHub for helping me figure out the logic here.
