Setting up a Docker Swarm cluster on AWS EC2

What we have set up so far is essentially a blank slate. AWS has a long list of offerings that could be deployed to the VPC we've created. In this section, we'll launch a single EC2 instance, install Docker on it, and set up a single-node Docker Swarm cluster. We'll use this to familiarize ourselves with Docker Swarm. In the remainder of the chapter, we'll build more servers to create a larger swarm cluster for full deployment of Notes.

A Docker Swarm cluster is simply a group of servers running Docker that have been joined together into a common pool. The code for the Docker Swarm orchestrator is bundled with the Docker Engine server but is disabled by default. To create a swarm, we simply enable swarm mode by running docker swarm init, and then run a docker swarm join command on each additional system we want to be part of the cluster (a minimal sketch of these two commands appears after the following feature list). From there, the Docker Swarm code automatically takes care of a long list of tasks. The features of Docker Swarm include the following:

  • Horizontal scaling: When deploying a Docker service to a swarm, you tell it the desired number of instances as well as the memory and CPU requirements. The swarm takes that and computes the best distribution of tasks to nodes in the swarm.
  • Maintaining the desired state: From the services deployed to a swarm, the swarm calculates the desired state of the system and tracks its current actual state. Suppose one of the nodes crashes—the swarm will then readjust the running tasks to replace the ones that vaporized because of the crashed server.
  • Multi-host networking: The overlay network driver automatically distributes network connections across the network of machines in the swarm.
  • Secure by default: Swarm mode uses strong Transport Layer Security (TLS) encryption for all communication between nodes.
  • Rolling updates: You can deploy an update to a service in such a way that the swarm incrementally brings down existing service containers and replaces them with updated ones.

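Here is a minimal sketch of those two commands. The join token and manager address shown are placeholders; the real values are printed by docker swarm init rather than anything appearing in this chapter:

# On the machine that becomes the first manager:
docker swarm init
# docker swarm init prints a join command containing a one-time token, roughly:
docker swarm join --token SWMTKN-1-<token> <manager-ip>:2377
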
We will use this section to not only learn how to set up a Docker Swarm but to also learn something about how Docker orchestration works.

To get started, we’ll set up a single-node swarm on a single EC2 instance in order to learn some basics, before we move on to deploying a multi-node swarm and deploying the full Notes stack.

1. Deploying a single-node Docker Swarm on a single EC2 instance

For a quick introduction to Docker Swarm, let’s start by installing Docker on a single EC2 node. We can kick the tires by trying a few commands and exploring the resulting system.

This will involve deploying Ubuntu 20.04 on an EC2 instance, configuring it to have the latest Docker Engine, and initializing swarm mode.

1.1. Adding an EC2 instance and configuring Docker

To launch an EC2 instance, we must first select which operating system to install. There are thousands of operating system configurations available. Each of these configurations is identified by an AMI code, where AMI stands for Amazon Machine Image.

To find your desired AMI, navigate to the EC2 dashboard on the AWS console. Then, click on the Launch Instance button, which starts a wizard-like interface to launch an instance. You can, if you like, go through the whole wizard since that is one way to learn about EC2 instances. We can search the AMIs via the first page of that wizard, where there is a search box.

For this exercise, we will use Ubuntu 20.04, so enter Ubuntu in the search box and scroll down to find the correct version.

In the search results, the AMI code starts with ami-, and there is one version for x86 CPUs and another for ARM (previously Advanced RISC Machine). ARM processors, by the way, are not just for your cell phone but are also used in servers. There is no need to launch an EC2 instance from here, since we will instead do so with Terraform.
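
If you prefer the command line, the AWS CLI can also look up current Ubuntu 20.04 AMIs. The following is a sketch, assuming the AWS CLI is installed and configured; the owner ID 099720109477 is Canonical's public account, and the name filter is illustrative:

$ aws ec2 describe-images --owners 099720109477 \
    --filters 'Name=name,Values=ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*' \
              'Name=state,Values=available' \
    --query 'sort_by(Images, &CreationDate)[-1].[ImageId,Name]' \
    --output text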

Another attribute to select is the instance size. AWS supports a long list of sizes that relate to the amount of memory, CPU cores, and disk space. For a chart of the available instance types, click on the Select button to proceed to the second page of the wizard, which shows a table of instance types and their attributes. For this exercise, we will use the t2.micro instance type because it is eligible for the free tier.

Create a file named ec2-public.tf containing the following:

resource "aws_instance" "public" {
  ami           = var.ami_id
  instance_type = var.instance_type
  subnet_id     = aws_subnet.public1.id
  key_name      = var.key_pair
  vpc_security_group_ids      = [ aws_security_group.ec2-public-sg.id ]
  associate_public_ip_address = true
  tags = {
    Name = "${var.project_name}-ec2-public"
  }
  depends_on = [ aws_vpc.notes, aws_internet_gateway.igw ]
  user_data = join("\n", [
    "#!/bin/sh",
    file("sh/docker_install.sh"),
    "docker swarm init",
    "sudo hostname ${var.project_name}-public"
  ])
}

In the Terraform AWS provider, the resource name for EC2 instances is aws_instance. Since this instance is attached to our public subnet, we’ll call it aws_instance.public. Because it is a public EC2 instance, the associate_public_ip_address attribute is set to true.

The attributes include the AMI ID, the instance type, the ID for the subnet, and more. The key_name attribute refers to the name of an SSH key we’ll use to log in to the EC2 instance. We’ll discuss these key pairs later. The vpc_security_group_ids attribute is a reference to a security group we’ll apply to the EC2 instance. The depends_on attribute causes Terraform to wait for the creation of the resources named in the array. The user_data attribute is a shell script that is executed inside the instance once it is created.

For the AMI, instance type, and key-pair data, add these entries to variables.tf, as follows:

variable "ami_id"        { default = "ami-09dd2e08d601bff67" }
variable "instance_type" { default = "t2.micro" }
variable "key_pair"      { default = "notes-app-key-pair" }

The AMI ID shown here is specifically for Ubuntu 20.04 in us-west-2. There will be other AMI IDs in other regions. The key_pair name shown here should be the key-pair name you selected when creating your key pair earlier.

It is not necessary to add the key-pair file to this directory, nor to reference the file you downloaded in these scripts. Instead, you simply give the name of the key pair. In our example, we named it notes-app-key-pair, and downloaded notes-app-key-pair.pem.
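
If you do not remember the exact key-pair name registered with AWS, it can be listed with the AWS CLI (a sketch, assuming the CLI is installed and configured for your account and region):

$ aws ec2 describe-key-pairs --query 'KeyPairs[].KeyName' --output text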

The user_data feature is very useful since it lets us customize an instance after creation. We're using it to automate the Docker setup on our instances. This field receives a string containing a shell script that executes once the instance is launched. Rather than insert that script inline with the Terraform code, we have created a set of files containing shell script snippets. The Terraform file function reads the named file and returns it as a string. The Terraform join function takes an array of strings and concatenates them with the delimiter character in between. Between the two, we construct a shell script that first installs Docker Engine, then initializes Docker Swarm mode, and finally changes the hostname to help us remember that this is the public EC2 instance.

Create a directory named sh in which we’ll create shell scripts, and in that directory create a file named docker_install.sh. To this file, add the following:

# Allow apt-get to download packages from HTTPS repositories
sudo apt-get update
sudo apt-get upgrade -y
sudo apt-get -y install apt-transport-https \
     ca-certificates curl gnupg-agent software-properties-common
# Add Docker's GPG key and package repository
curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
    | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88
sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) stable"
# Install Docker Engine and related tools
sudo apt-get update
sudo apt-get upgrade -y
sudo apt-get install -y docker-ce docker-ce-cli containerd.io
# Let the ubuntu user run docker without sudo, and start Docker on boot
sudo groupadd docker
sudo usermod -aG docker ubuntu
sudo systemctl enable docker

This script is derived from the official instructions for installing Docker Engine Community Edition (CE) on Ubuntu. The first portion adds support for apt-get to download packages from HTTPS repositories. It then configures the Docker package repository in Ubuntu, after which it installs Docker and related tools. Finally, it ensures that the docker group exists and that the ubuntu user is a member of that group. The Ubuntu AMI sets ubuntu as the default user ID for EC2 administration.

For this EC2 instance, we also run docker swarm init to initialize the Docker Swarm. For other EC2 instances, we do not run this command. The method used for initializing the user_data attribute lets us easily have a custom configuration script for each EC2 instance. For the other instances, we’ll only run docker_install.sh, whereas for this instance, we’ll also initialize the swarm.
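
For comparison, here is a sketch of what the user_data for one of those other (non-manager) instances could look like; the hostname shown is illustrative, since those declarations come later in the chapter:

  user_data = join("\n", [
    "#!/bin/sh",
    file("sh/docker_install.sh"),
    "sudo hostname ${var.project_name}-private1"
  ])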

Back in ec2-public.tf, we have two more things to do, and then we can launch the EC2 instance. Have a look at the following code block:

resource "aws_security_group" "ec2-public-sg" {
  name        = "${var.project_name}-public-security-group"
  description = "allow inbound access to the EC2 instance"
  vpc_id      = aws_vpc.notes.id

  ingress {
    protocol    = "TCP"
    from_port   = 22
    to_port     = 22
    cidr_blocks = [ "0.0.0.0/0" ]
  }

  ingress {
    protocol    = "TCP"
    from_port   = 80
    to_port     = 80
    cidr_blocks = [ "0.0.0.0/0" ]
  }

  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = [ "0.0.0.0/0" ]
  }
}

This is the security group declaration for the public EC2 instance. Remember that a security group describes the rules of a firewall, and security groups can be attached to many kinds of AWS objects. This security group was already referenced in declaring aws_instance.public.

The main feature of security groups is the ingress and egress rules. As the words imply, ingress rules describe the network traffic allowed to enter the resource, and egress rules describe what’s allowed to be sent by the resource. If you have to look up those words in a dictionary, you’re not alone.

We have two ingress rules, and the first allows traffic on port 22, which covers SSH traffic. The second allows traffic on port 80, covering HTTP. We’ll add more Docker rules later when they’re needed.
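
As a preview of those later rules, Docker Swarm uses TCP port 2377 for cluster management, TCP and UDP port 7946 for node-to-node communication, and UDP port 4789 for overlay network traffic. A sketch of one such ingress rule follows; restricting it to the VPC address range via aws_vpc.notes.cidr_block is a suggestion rather than something declared above:

  ingress {
    protocol    = "TCP"
    from_port   = 2377
    to_port     = 2377
    cidr_blocks = [ aws_vpc.notes.cidr_block ]
  }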

The egress rule allows the EC2 instance to send any traffic to any machine on the internet.

These ingress rules keep the attack surface available to miscreants small by allowing only SSH and HTTP traffic in from the internet.

The final task is to add these output declarations to ec2-public.tf, as follows:

output "ec2-public-arn"  { value = aws_instance.public.arn }
output "ec2-public-dns"  { value = aws_instance.public.public_dns }
output "ec2-public-ip"   { value = aws_instance.public.public_ip }
output "ec2-private-dns" { value = aws_instance.public.private_dns }
output "ec2-private-ip"  { value = aws_instance.public.private_ip }

This will let us know the public IP address and public DNS name. If we’re interested, the outputs also tell us the private IP address and DNS name.
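
After a successful deployment, any of these values can be retrieved on demand with terraform output, for example:

$ terraform output ec2-public-ip

Newer Terraform releases also support terraform output -raw ec2-public-ip, which prints the bare value for use in shell scripts.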

1.2. Launching the EC2 instance on AWS

We have now added the Terraform declarations for creating an EC2 instance and its security group.

We’re now ready to deploy this to AWS and see what we can do with it. We already know what to do, so let’s run the following command:

$ terraform plan

Plan: 2 to add, 0 to change, 0 to destroy. 

If the VPC infrastructure were already running, you would get output similar to this. The addition is two new objects, aws_instance.public and aws_security_group.ec2-public-sg. This looks good, so we proceed to deployment, as follows:

$ terraform apply

Plan: 2 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

Outputs:

aws_region = us-west-2
ec2-private-dns = ip-10-0-1-55.us-west-2.compute.internal
ec2-private-ip = 10.0.1.55
ec2-public-arn = arn:aws:ec2:us-west-2:098106984154:instance/i-0046b28d65a4f555d
ec2-public-dns = ec2-54-213-6-249.us-west-2.compute.amazonaws.com
ec2-public-ip = 54.213.6.249
igw_id = igw-006eb101f8cb423d4
private1_cidr = 10.0.3.0/24
public1_cidr = 10.0.1.0/24
subnet_private1_id = subnet-0a9044daea298d1b2
subnet_public1_id = subnet-07e6f8ed6cc6f8397
vpc_arn = arn:aws:ec2:us-west-2:098106984154:vpc/vpc-074b2dfa7b353486f
vpc_cidr = 10.0.0.0/16
vpc_id = vpc-074b2dfa7b353486f
vpc_name = notes-vpc

This built our EC2 instance, and we have its IP address and domain name. Because the initialization script takes a couple of minutes to run, it is good to wait a short while before testing the system.

The ec2-public-ip value is the public IP address for the EC2 instance. In the following examples, we will put the text PUBLIC-IP-ADDRESS, and you must of course substitute the IP address your EC2 instance is assigned.

We can log in to the EC2 instance like so:

$ ssh -i ~/Downloads/notes-app-key-pair.pem ubuntu@PUBLIC-IP-ADDRESS
The authenticity of host '54.213.6.249 (54.213.6.249)' can't be established.
ECDSA key fingerprint is SHA256:DOGsiDjWZ6rkj1+AiMcqqy/naAku5b4VJUgZqtlwPg8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '54.213.6.249' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 20.04 LTS (GNU/Linux 5.4.0-1009-aws x86_64)

To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

ubuntu@notes-public:~$ hostname
notes-public

On a Linux or macOS system where we’re using SSH, the command is as shown here. The -i option lets us specify the Privacy Enhanced Mail (PEM) file that was provided by AWS for the key pair. If on Windows using PuTTY, you’d instead tell it which PuTTY Private Key (PPK) file to use, and the connection parameters will otherwise be similar to this.

This lands us at the command-line prompt of the EC2 instance. We see that it is Ubuntu 20.04 and that the hostname is set to notes-public, as reflected in both the shell prompt and the output of the hostname command. This tells us that our initialization script ran to completion, because setting the hostname was the last configuration task it performed.

1.3. Handling the AWS EC2 key-pair file

Earlier, we said to safely store the key-pair file somewhere on your computer. In the previous section, we showed how to use the PEM file with SSH to log in to the EC2 instance. Namely, we use the PEM file like so:

$ ssh -i /path/to/key-pair.pem USER-ID@HOST-IP 

It can be inconvenient to remember to add the -i flag every time we use SSH. To avoid having to use this option, run this command:

$ ssh-add /path/to/key-pair.pem 

As the command name implies, this adds the authentication file to SSH. This has to be rerun on every reboot of the computer, but it conveniently lets us access EC2 instances without remembering to specify this option.
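
A further convenience, not required by anything in this chapter, is an entry in your ~/.ssh/config file so that the user ID and key file are picked up automatically:

Host notes-public
    HostName PUBLIC-IP-ADDRESS
    User ubuntu
    IdentityFile ~/Downloads/notes-app-key-pair.pem

With that in place, ssh notes-public is all that's needed to log in.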

1.4. Testing the initial Docker Swarm

We have an EC2 instance and it should already be configured with Docker, and we can easily verify that this is the case as follows:

ubuntu@notes-public:~$ docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
0e03bdcc26d7: Pull complete

The setup script was also supposed to have initialized this EC2 instance as a Docker Swarm node, and the following command verifies whether that happened:

ubuntu@notes-public:~$ docker info
 Swarm: active
  NodeID: qfb1ljmw2fgp4ij18klowr8dp
  Is Manager: true
  ClusterID: 14p4sdfsdyoa8el0v9cqirm23

The docker info command, as the name implies, prints out a lot of information about the current Docker instance. In this case, the output includes verification that it is in Docker Swarm mode and that this is a Docker Swarm manager instance.
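
If you only want the swarm status rather than the full report, docker info accepts a Go-template format string (the field names below match current Docker releases):

ubuntu@notes-public:~$ docker info --format '{{.Swarm.LocalNodeState}} {{.Swarm.ControlAvailable}}'
active true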

Let’s try a couple of swarm commands, as follows:

ubuntu@notes-public:~$ docker node ls
ID                          HOSTNAME      STATUS  AVAILABILITY  MANAGER STATUS  ENGINE VERSION
qfb1ljmw2fgp4ij18klowr8dp * notes-public  Ready   Active        Leader          19.03.9

ubuntu@notes-public:~$ docker service ls
ID  NAME  MODE  REPLICAS  IMAGE  PORTS

The docker node command is for managing the nodes in a swarm. In this case, there is only one node—this one, and it is shown as not only a manager but as the swarm leader. It’s easy to be the leader when you’re the only node in the cluster, it seems.

The docker service command is for managing the services deployed in the swarm. In this context, a service is roughly the same as an entry in the services section of a Docker compose file. In other words, a service is not the running container but is an object describing the configuration for launching one or more instances of a given container.

To see what this means, let’s start an nginx service, as follows:

ubuntu@notes-public:~$ docker service create --name nginx --replicas 1 -p 80:80 nginx
ephvpfgjwxgdwx7ab87e7nc9e
overall progress: 1 out of 1 tasks
1/1: running
verify: Service converged

ubuntu@notes-public:~$ docker service ls
ID            NAME   MODE        REPLICAS  IMAGE         PORTS
ephvpfgjwxgd  nginx  replicated  1/1       nginx:latest  *:80->80/tcp

ubuntu@notes-public:~$ docker service ps nginx
ID            NAME     IMAGE         NODE          DESIRED STATE  CURRENT STATE           ERROR  PORTS
ag8b45t69am1  nginx.1  nginx:latest  notes-public  Running        Running 15 seconds ago

We started one service using the nginx image. We said to deploy one replica and to expose port 80. We chose the nginx image because it serves a simple default HTML page that we can easily view in a browser.

Simply paste the IP address of the EC2 instance into the browser location bar, and we’re greeted with that default HTML.

We can also see, using docker service ls and docker service ps, that there is one instance of the service. Since this is a swarm, let's increase the number of nginx instances, as follows:

ubuntu@notes-public:~$ docker service update --replicas 3 nginx
nginx
overall progress: 3 out of 3 tasks
1/3: running
2/3: running
3/3: running
verify: Service converged

ubuntu@notes-public:~$ docker service ls
ID            NAME   MODE        REPLICAS  IMAGE         PORTS
ephvpfgjwxgd  nginx  replicated  3/3       nginx:latest  *:80->80/tcp

ubuntu@notes-public:~$ docker service ps nginx
ID            NAME     IMAGE         NODE          DESIRED STATE  CURRENT STATE            ERROR  PORTS
ag8b45t69am1  nginx.1  nginx:latest  notes-public  Running        Running 9 minutes ago
ojvbs4n2iriy  nginx.2  nginx:latest  notes-public  Running        Running 13 seconds ago
fqwwk8c4tqck  nginx.3  nginx:latest  notes-public  Running        Running 13 seconds ago

Once a service is deployed, we can modify the deployment using the docker service update command. In this case, we told it to increase the number of instances using the --replicas option, and we now have three instances of the nginx container, all running on the notes-public node.
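
As an aside, the docker service scale command is shorthand for the same replica adjustment; the following would have had the same effect:

ubuntu@notes-public:~$ docker service scale nginx=3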

We can also run the normal docker ps command to see the actual containers, as illustrated in the following code block:

ubuntu@notes-public:~$ docker ps
CONTAINER ID  IMAGE         COMMAND                CREATED             STATUS             PORTS   NAMES
6dc274c30fea  nginx:latest  "nginx -g 'daemon of"  About a minute ago  Up About a minute  80/tcp  nginx.2.ojvbs4n2iriyjifeh0ljlyvhp
4b51455fb2bf  nginx:latest  "nginx -g 'daemon of"  About a minute ago  Up About a minute  80/tcp  nginx.3.fqwwk8c4tqckspcrrzbs0qyii
e7ed31f9471f  nginx:latest  "nginx -g 'daemon of"  10 minutes ago      Up 10 minutes      80/tcp  nginx.1.ag8b45t69am1gzh0b65gfnq14

This verifies that the nginx service with three replicas is actually three nginx containers.
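
If you would like to tidy up the experiment before moving on (the chapter will shortly destroy the whole instance anyway), the service and its containers can be removed with one command:

ubuntu@notes-public:~$ docker service rm nginx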

In this section, we were able to launch an EC2 instance and set up a single-node Docker swarm in which we launched a service, which gave us the opportunity to familiarize ourselves with what this can do.

While we’re here, there is another thing to learn—namely, how to set up the remote control of Docker hosts.

2. Setting up remote control access to a Docker Swarm hosted on EC2

A feature that’s not well documented in Docker is the ability to control Docker nodes remotely. This will let us, from our laptop, run Docker commands on a server. By extension, this means that we will be able to manage the Docker Swarm from our laptop.

One method for remotely controlling a Docker instance is to expose the Docker Transmission Control Protocol (TCP) port. Be aware that miscreants are known to scan the internet for exposed Docker ports to hijack. The following technique does not expose the Docker port but instead uses SSH.

The following setup is for Linux and macOS, relying on features of SSH. To do this on Windows would rely on installing OpenSSH. From October 2018, OpenSSH became available for Windows, and the following commands may work in PowerShell (failing that, you can run these commands from a Multipass or Windows Subsystem for Linux (WSL) 2 instance on Windows):

ubuntu@notes-public:~$ logout
Connection to PUBLIC-IP-ADDRESS closed.

Exit the shell on the EC2 instance so that you’re at the command line on your laptop. Run the following command:

$ ssh-add ~/Downloads/notes-app-key-pair.pem
Identity added: /Users/david/Downloads/notes-app-key-pair.pem (/Users/david/Downloads/notes-app-key-pair.pem)

We discussed this command earlier, noting that it lets us log in to EC2 instances without having to use the -i option to specify the PEM file. This is more than a simple convenience when it comes to remotely accessing Docker hosts. The following steps are dependent on having added the PEM file to SSH, as shown here.

To verify you’ve done this correctly, use this command:

$ ssh ubuntu@PUBLIC-IP-ADDRESS 

Normally with an EC2 instance, we would use the -i option, as shown earlier. But after running ssh-add, the -i option is no longer required.

That enables us to create the following environment variable:

$ export DOCKER_HOST=ssh://ubuntu@PUBLIC-IP-ADDRESS

$ docker service ls

ID           NAME MODE       REPLICAS IMAGE        PORTS

ephvpfgjwxgd nginx replicated 3/3      nginx:latest *:80->80/tcp 

The DOCKER_HOST environment variable enables the remote control of Docker hosts. It relies on a passwordless SSH login to the remote host. Once you have that, it’s simply a matter of setting the environment variable and you’ve got remote control of the Docker host, and in this case, because the host is a swarm manager, a remote swarm.

But this gets even better by using the Docker context feature. A context is a configuration required to access a remote node or swarm. Have a look at the following code snippet:

$ unset DOCKER_HOST 

We begin by deleting the environment variable because we’ll replace it with something better, as follows:

$ docker context create ec2 --docker host=ssh://ubuntu@PUBLIC-IP-ADDRESS
ec2
Successfully created context "ec2"

$ docker --context ec2 service ls
ID            NAME   MODE        REPLICAS  IMAGE         PORTS
ephvpfgjwxgd  nginx  replicated  3/3       nginx:latest  *:80->80/tcp

$ docker context use ec2
ec2
Current context is now "ec2"

$ docker service ls
ID            NAME   MODE        REPLICAS  IMAGE         PORTS
ephvpfgjwxgd  nginx  replicated  3/3       nginx:latest  *:80->80/tcp

We create a context using docker context create, specifying the same SSH URL we used in the DOCKER_HOST variable. We can then use it either with the --context option or by running docker context use to switch between contexts.

With this feature, we can easily maintain configurations for multiple remote servers and switch between them with a simple command.
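
To see which contexts exist, and which one is currently active (marked with an asterisk), list them like so:

$ docker context ls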

For example, the Docker instance on our laptop is the default context. Therefore, we might find ourselves doing this:

$ docker context use default

… run docker commands against Docker on the laptop

$ docker context use ec2

… run docker commands against Docker on the AWS EC2 machines 

There are times when we must be cognizant of which is the current Docker context and when to use which context. This will be useful in the next section when we learn how to push the images to AWS ECR.

We've learned a lot in this section, so before heading to the next task, let's clean up our AWS infrastructure. There's no need to keep this EC2 instance running since we used it solely for a quick familiarization tour. We can easily delete this instance while leaving the rest of the infrastructure configured. The most effective way to do so is by renaming ec2-public.tf to ec2-public.tf-disable and rerunning terraform apply, as illustrated in the following code block:

$ mv ec2-public.tf ec2-public.tf-disable
$ terraform apply

Plan: 0 to add, 0 to change, 2 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

The effect of renaming one of the Terraform files is that Terraform will no longer scan it for objects to deploy. When Terraform maps out the desired state, it notices that the deployed EC2 instance and security group are no longer declared in the local files and therefore destroys those objects. In other words, this lets us undeploy some infrastructure with very little fuss.

This tactic can be useful for minimizing costs by turning off unneeded facilities. You can easily redeploy the EC2 instance by renaming the file back to ec2-public.tf and rerunning terraform apply.
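
An alternative, if you prefer not to rename files, is Terraform's -target option, which destroys only the named resources (Terraform cautions that -target is intended for exceptional situations):

$ terraform destroy \
    -target=aws_instance.public \
    -target=aws_security_group.ec2-public-sg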

In this section, we familiarized ourselves with Docker Swarm by deploying a single-node swarm on an EC2 instance on AWS. We first added suitable declarations to our Terraform files. We then deployed the EC2 instance on AWS. Following deployment, we set about verifying that, indeed, Docker Swarm was already installed and initialized on the server and that we could easily deploy Docker services on the swarm. We then learned how to set up remote control of the swarm from our laptop.

Taken together, this proved that we can easily deploy Docker-based services to EC2 instances on AWS. In the next section, let’s continue preparing for a production-ready deployment by setting up a build process to push Docker images to image repositories.

Source: Herron David (2020), Node.js Web Development: Server-side web development made easy with Node 14 using practical examples, Packt Publishing.
