Provisioning EC2 instances for a full Docker swarm

So far in this chapter, we have used Terraform to create the required infrastructure on AWS, and then we set up a single-node Docker swarm on an EC2 instance to learn about Docker Swarm. After that, we pushed the Docker images to ECR, and we have set up a Docker stack file for deployment to a swarm. We are ready to set up the EC2 instances required for deploying a full swarm.

Docker Swarm is able to handle Docker deployments to large numbers of host systems. Of course, the Notes application only has delusions of grandeur and doesn’t need that many hosts. We’ll be able to do everything with three or four EC2 instances. We have declared one so far, and will declare two more that will live on the private subnet. But from this humble beginning, it would be easy to expand to more hosts.

Our goal in this section is to create an infrastructure for deploying Notes on EC2 using Docker Swarm. This will include the following:

  • Configuring additional EC2 instances on the private subnet, installing Docker on those instances, and joining them together in a multi-host Docker Swarm
  • Creating semi-automated scripting, thereby making it easy to deploy and configure the EC2 instances for the swarm
  • Using an nginx container on the public EC2 instance as a proxy in front of the Notes container

That’s quite a lot of things to take care of, so let’s get started.

1. Configuring EC2 instances and connecting to the swarm

We have one EC2 instance declared for the public subnet, and it is necessary to add two more for the private subnet. The security model we discussed earlier focused on keeping as much as possible in a private secure network infrastructure. On AWS, that means putting as much as possible on the private subnet.

Earlier, you may have renamed ec2-public.tf to ec2-public.tf-disable. If so, you should now change back the filename to ec2-public.tf. Remember that this tactic is useful for minimizing AWS resource usage when it is not needed.

Create a new file in the terraform-swarm directory named ec2-private.tf, as follows:

resource "aws_instance" "private-db1" {
  ami           = var.ami_id
  // instance_type = var.instance_type
  instance_type = "t2.medium"
  subnet_id     = aws_subnet.private1.id
  key_name      = var.key_pair
  vpc_security_group_ids      = [ aws_security_group.ec2-private-sg.id ]
  associate_public_ip_address = false

  root_block_device {
    volume_size = 50
  }

  tags = {
    Name = "${var.project_name}-ec2-private-db1"
  }

  depends_on = [ aws_vpc.notes, aws_internet_gateway.igw ]

  user_data = join("\n", [
    "#!/bin/sh",
    file("sh/docker_install.sh"),
    "mkdir -p /data/notes /data/users",
    "sudo hostname ${var.project_name}-private-db1"
  ])
}

resource "aws_instance" "private-svc1" {
  ami           = var.ami_id
  instance_type = var.instance_type
  subnet_id     = aws_subnet.private1.id
  key_name      = var.key_pair
  vpc_security_group_ids      = [ aws_security_group.ec2-private-sg.id ]
  associate_public_ip_address = false

  tags = {
    Name = "${var.project_name}-ec2-private-svc1"
  }

  depends_on = [ aws_vpc.notes, aws_internet_gateway.igw ]

  user_data = join("\n", [
    "#!/bin/sh",
    file("sh/docker_install.sh"),
    "sudo hostname ${var.project_name}-private-svc1"
  ])
}

This declares two EC2 instances attached to the private subnet. The two declarations are nearly identical, differing only in the name and in a few customizations on private-db1 that we’ll discuss next. Because they’re on the private subnet, they are not assigned a public IP address.

Because we use the private-db1 instance for databases, we have allocated 50 gigabytes (GB) for the root device. The root_block_device block is for customizing the root disk of an EC2 instance. Among the available settings, volume_size sets its size, in GB.

Another difference in private-db1 is the instance_type, which we’ve hardcoded to t2.medium. This is because we will deploy two database containers to that server. A t2.micro instance has 1 GB of memory, and the two databases were observed to overwhelm it. If you want the adventure of debugging that situation, change this value to var.instance_type, which defaults to t2.micro, and then read the section at the end of the chapter about debugging what happens.
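For reference, the variable referred to here was declared earlier in the chapter in variables.tf. A minimal sketch of that declaration, assuming the defaults described above, looks like this:

variable "instance_type" {
  // Default size for most EC2 instances in this deployment.
  // The private-db1 instance overrides this with a hardcoded t2.medium.
  default = "t2.micro"
}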

Notice that for the user_data script, we only send in the script to install Docker support, and not the script to initialize a swarm. The swarm was initialized on the public EC2 instance; the other instances must instead join the swarm using the docker swarm join command. Later, we will go over initializing the swarm and see how that’s accomplished. For the private-db1 instance, we also create the /data/notes and /data/users directories, which will hold the database data directories.

Add the following code to ec2-private.tf:

resource "aws_security_group" "ec2-private-sg" {
  name        = "${var.project_name}-private-sg"
  description = "allow inbound access to the EC2 instance"
  vpc_id      = aws_vpc.notes.id

  ingress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = [ aws_vpc.notes.cidr_block ]
  }

  ingress {
    description = "Docker swarm (udp)"
    protocol    = "UDP"
    from_port   = 0
    to_port     = 0
    cidr_blocks = [ aws_vpc.notes.cidr_block ]
  }

  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = [ "0.0.0.0/0" ]
  }
}

This is the security group for these EC2 instances. It allows any traffic from inside the VPC to reach them. This is the sort of security group we write when in a hurry; it is very lax, and the ingress rules should be tightened later.

Likewise, the ec2-public-sg security group needs to be equally lax. We’ll find that there is a long list of IP ports used by Docker Swarm and that the swarm will fail to operate unless those ports can communicate. For our immediate purposes, the easiest option is to allow any traffic, and we’ll leave a note in the backlog to address this issue in Chapter 14, Security in Node.js Applications.
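For reference, the ports Docker Swarm is documented to require are TCP 2377 for cluster management, TCP and UDP 7946 for node-to-node communication, and UDP 4789 for overlay network traffic. A tighter set of ingress rules, limited to the VPC, might look like the following sketch; we will not apply it now, but it shows the shape of the fix we are deferring to Chapter 14:

  ingress {
    description = "Docker swarm cluster management"
    protocol    = "tcp"
    from_port   = 2377
    to_port     = 2377
    cidr_blocks = [ aws_vpc.notes.cidr_block ]
  }

  ingress {
    description = "Docker swarm node communication (tcp)"
    protocol    = "tcp"
    from_port   = 7946
    to_port     = 7946
    cidr_blocks = [ aws_vpc.notes.cidr_block ]
  }

  ingress {
    description = "Docker swarm node communication (udp)"
    protocol    = "udp"
    from_port   = 7946
    to_port     = 7946
    cidr_blocks = [ aws_vpc.notes.cidr_block ]
  }

  ingress {
    description = "Docker swarm overlay network (udp)"
    protocol    = "udp"
    from_port   = 4789
    to_port     = 4789
    cidr_blocks = [ aws_vpc.notes.cidr_block ]
  }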

In ec2-public.tf, edit the ec2-public-sg security group to be the following:

resource "aws_security_group" "ec2-public-sg" {
  name        = "${var.project_name}-public-sg"
  description = "allow inbound access to the EC2 instance"
  vpc_id      = aws_vpc.notes.id

  ingress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = [ "0.0.0.0/0" ]
  }

  ingress {
    description = "Docker swarm (udp)"
    protocol    = "UDP"
    from_port   = 0
    to_port     = 0
    cidr_blocks = [ aws_vpc.notes.cidr_block ]
  }

  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = [ "0.0.0.0/0" ]
  }
}

This is decidedly not a best practice, since it allows any network traffic from any IP address to reach the public EC2 instance. However, it gives us the freedom to develop the code without worrying about protocols for the moment; we will return to this later and implement better security practices. Have a look at the following code snippet:

output "ec2-private-db1-arn"  { value = aws_instance.private-db1.arn }
output "ec2-private-db1-dns"  { value = aws_instance.private-db1.private_dns }
output "ec2-private-db1-ip"   { value = aws_instance.private-db1.private_ip }
output "ec2-private-svc1-arn" { value = aws_instance.private-svc1.arn }
output "ec2-private-svc1-dns" { value = aws_instance.private-svc1.private_dns }
output "ec2-private-svc1-ip"  { value = aws_instance.private-svc1.private_ip }

This outputs the useful attributes of the EC2 instances.

In this section, we declared EC2 instances for deployment on the private subnet. Each will have Docker installed. However, we still need to automate the setup of the swarm as far as we can.

2. Implementing semi-automatic initialization of the Docker Swarm

Ideally, when we run terraform apply, the infrastructure is automatically set up and ready to go. Automated setup reduces the overhead of running and maintaining the AWS infrastructure. We’ll get as close to that goal as possible.

For this purpose, let’s revisit the declaration of aws_instance.public in ec2-public.tf. Let’s rewrite it as follows:

resource "aws_instance" "public" {
  ami           = var.ami_id
  instance_type = var.instance_type
  subnet_id     = aws_subnet.public1.id
  key_name      = var.key_pair
  vpc_security_group_ids      = [ aws_security_group.ec2-public-sg.id ]
  associate_public_ip_address = true

  tags = {
    Name = "${var.project_name}-ec2-public"
  }

  depends_on = [
    aws_vpc.notes, aws_internet_gateway.igw,
    aws_instance.private-db1, aws_instance.private-svc1
  ]

  user_data = join("\n", [
    "#!/bin/sh",
    file("sh/docker_install.sh"),
    "docker swarm init",
    "sudo hostname ${var.project_name}-public",
    "docker node update --label-add type=public ${var.project_name}-public",
    templatefile("sh/swarm-setup.sh", {
      instances = [ {
        dns  = aws_instance.private-db1.private_dns,
        type = "db",
        name = "${var.project_name}-private-db1"
      }, {
        dns  = aws_instance.private-svc1.private_dns,
        type = "svc",
        name = "${var.project_name}-private-svc1"
      } ]
    })
  ])
}

This is largely the same as before, but with two changes. The first is to add references to the private EC2 instances to the depends_on attribute. This will delay the construction of the public EC2 instance until after the other two are running.

The other change is to extend the shell script attached to the user_data attribute. The first addition to that script is to set the type label on the notes-public node. That label is used with service placement.
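As a reminder of how such labels are consumed, placement constraints in a Docker stack file match against node.labels. A hypothetical service entry (the service name here is illustrative, not taken from the stack file built earlier) would pin itself to the public node like so:

services:
  notesapp:
    # ... image, ports, and so on
    deploy:
      placement:
        constraints:
          # Only schedule this service on the node labeled type=public
          - node.labels.type == public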

The final change is the script with which we’ll set up the swarm. Instead of setting up the swarm directly in the user_data script, user_data generates a script that we will run later to create the swarm. In the sh directory, create a file named swarm-setup.sh containing the following:

cat >/home/ubuntu/swarm-setup.sh <<EOF
#!/bin/sh
### Capture the file name for the PEM from the command line
PEM=\$1
join="\`docker swarm join-token manager | sed 1,2d | sed 2d\`"
%{ for instance in instances ~}
ssh -i \$PEM ${instance.dns} \$join
docker node update --label-add type=${instance.type} ${instance.name}
%{ endfor ~}
EOF

This generates a shell script that will be used to initialize the swarm. Because the setup relies on executing commands on the other EC2 instances, the PEM file for the AWS key pair must be present on the notes-public instance. However, it is not possible to send the key-pair file to the notes-public instance when running terraform apply. Therefore, we use the pattern of generating a shell script, which will be run later.

The pattern being followed is shown in the following code snippet:

cat >/path/to/file <<EOF

… text to output

EOF 

The part between <<EOF and EOF is supplied as the standard input to the cat command. The result is, therefore, for /home/ubuntu/swarm-setup.sh to end up with the text between those markers. An additional detail is that a number of variable references are escaped, as in PEM=\$1. This is necessary so that those variables are not evaluated while setting up this script but are present in the generated script. The backticks around the docker swarm join-token pipeline are escaped for the same reason, so the command substitution runs when swarm-setup.sh executes rather than when it is generated.

This script is processed using the templatefile function so that we can use template directives. Primarily, that is the %{ for } loop with which we generate the commands for configuring each EC2 instance. Notice that an array with an entry for each instance is passed in through the templatefile invocation.

Therefore, the swarm-setup.sh script will contain a copy of the following pair of commands for each EC2 instance:

ssh -i $PEM ${instance.dns} $join
docker node update --label-add type=${instance.type} ${instance.name}

The first line uses SSH to execute the swarm join command on the EC2 instance. For this to work, we need to supply the AWS key pair, which must be given on the command line so that it becomes the PEM variable. The second line adds the type label, with the named value, to the named swarm node.

What is the $join variable? It holds the output of running docker swarm join-token, so let’s take a look at what that is.

Docker uses a swarm join token to facilitate connecting Docker hosts as a node in a swarm. The token contains cryptographically signed information that authenticates the attempt to join the swarm. We get the token by running the following command:

$ docker swarm join-token manager
To add a manager to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-1l161hnrjbmzg1r8a46e34dt21sl5n4357qrib29csi0jgi823-3g80csolwaioya580hjanwfsf 10.0.3.14:2377

The word manager here means that we are requesting a token to join as a manager node. To connect a node as a worker, simply replace manager with worker.

Once the EC2 instances are deployed, we could log in to notes-public, run this command to get the join token, and then run the join command on each of the other EC2 instances. The swarm-setup.sh script, however, handles this for us. All we have to do, once the EC2 hosts are deployed, is log in to notes-public and run this script.

It runs the docker swarm join-token manager command, piping that user-friendly text through a couple of sed commands to extract the important part. That leaves the join variable containing the text of the docker swarm join command, and it then uses SSH to execute that command on each of the instances.
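To see what that pipeline produces, you can run it by hand on notes-public. Assuming the output format shown earlier, the two sed invocations delete the explanatory text and blank lines, leaving only the join command (the token and IP address below are placeholders):

ubuntu@notes-public:~$ docker swarm join-token manager | sed 1,2d | sed 2d
    docker swarm join --token <JOIN-TOKEN> <MANAGER-IP>:2377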

In this section, we examined how to automate, as far as possible, the setup of the Docker swarm.

Let’s now do it.

3. Preparing the Docker Swarm before deploying the Notes stack

When you make an omelet, it’s best to cut up all the veggies and sausage, prepare the butter, and whip the milk and eggs into a mix before you heat up the pan. In other words, we prepare the ingredients before undertaking the critical action of preparing the dish. What we’ve done so far is to prepare all the elements of successfully deploying the Notes stack to AWS using Docker Swarm. It’s now time to turn on the pan and see how well it works.

We have everything declared in the Terraform files, and we can deploy our complete system with the following command:

$ terraform apply

Plan: 5 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

This deploys the EC2 instances on AWS. Make sure to record all the output parameters. We’re especially interested in the domain names and IP addresses for the three EC2 instances.
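If you don’t capture the output at apply time, you can print it again at any point with terraform output, either all values at once or a single named value:

$ terraform output                       # prints every declared output value
$ terraform output ec2-private-db1-ip    # prints a single output value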

As before, the notes-public instance should have a Docker swarm initialized. We have added two more instances, notes-private-db1 and notes-private-svc1. Both will have Docker installed, but they are not joined to the swarm. Instead, we need to run the generated shell script for them to become nodes in the swarm, as follows:

$ scp ~/Downloads/notes-app-key-pair.pem ubuntu@PUBLIC-IP-ADDRESS:
The authenticity of host '52.39.219.109 (52.39.219.109)' can't be established.
ECDSA key fingerprint is SHA256:qdK5ZPn1EtmO1RWljb0dG3Nu2mDQHtmFwcw4fq9s6vM.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '52.39.219.109' (ECDSA) to the list of known hosts.
notes-app-key-pair.pem                        100% 1670    29.2KB/s   00:00

$ ssh ubuntu@PUBLIC-IP-ADDRESS
Welcome to Ubuntu 20.04 LTS (GNU/Linux 5.4.0-1009-aws x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

We have already run ssh-add on our laptop, so SSH and secure copy (SCP) commands run without explicitly referencing the PEM file. However, the SSH client on the notes-public EC2 instance does not have the PEM file. Therefore, to reach the other EC2 instances from that host, the PEM file must be available there, which is why we used scp to copy it to the notes-public instance.
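If you have not already loaded the key into your local SSH agent, doing so looks like this (the path is wherever you saved the key-pair file):

$ ssh-add ~/Downloads/notes-app-key-pair.pem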

If you want to verify the fact that the instances are running and have Docker active, type the following command:

ubuntu@notes-public:~$ ssh -i ./notes-app-key-pair.pem \

ubuntu@IP-FOR-EC2-INSTANCE docker run hello-world 

In this case, we are testing the private EC2 instances from a shell running on the public EC2 instance. That means we must use the private IP addresses printed when we ran Terraform. This command verifies SSH connectivity to an EC2 instance and verifies its ability to download and execute a Docker image.

Next, we can run swarm-setup.sh. On the command line, we must give the filename for the PEM file as the first argument, as follows:

ubuntu@notes-public:~$ sh -x swarm-setup.sh ./notes-app-key-pair.pem
+ PEM=./notes-app-key-pair.pem
+ ssh -i ./notes-app-key-pair.pem ip-10-0-3-151.us-west-2.compute.internal docker swarm join --token SWMTKN-1-04shb3msc7a1ydqcqmtyhych60wwptkxwcqiexi1ou6fetx2kg-7robjlgber03xo44jwx1yofaw 10.0.1.111:2377
This node joined a swarm as a manager.
+ docker node update --label-add type=db notes-private-db1
notes-private-db1
+ ssh -i ./notes-app-key-pair.pem ip-10-0-3-204.us-west-2.compute.internal docker swarm join --token SWMTKN-1-04shb3msc7a1ydqcqmtyhych60wwptkxwcqiexi1ou6fetx2kg-7robjlgber03xo44jwx1yofaw 10.0.1.111:2377
This node joined a swarm as a manager.
+ docker node update --label-add type=svc notes-private-svc1
notes-private-svc1

We can see that this uses SSH to execute the docker swarm join command on each EC2 instance, causing these two systems to join the swarm, and then sets the labels on each node. We can verify the result as illustrated in the following code snippet:

ubuntu@notes-public:~$ docker node ls
ID                            HOSTNAME             STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
ct7d65v8lhw6hxx0k8uk3lw8m     notes-private-db1    Ready    Active         Reachable        19.03.11
k1x2h83b0lrxnh38p3pypt91x     notes-private-svc1   Ready    Active         Reachable        19.03.11
nikgvfe4aum51yu5obqqnnz5s *   notes-public         Ready    Active         Leader           19.03.11

Indeed, these systems are now part of the cluster.

The swarm is ready to go, and we no longer need to be logged in to notes-public. Exiting back to our laptop, we can create the Docker context to control the swarm remotely, as follows:

$ docker context create ec2 --docker host=ssh://ubuntu@PUBLIC-IP-ADDRESS
ec2
Successfully created context "ec2"
$ docker context use ec2

We’ve already seen how this works: having done this, we can run Docker commands against the swarm from our laptop. For example, have a look at the following code snippet:

$ docker node ls
ID                            HOSTNAME             STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
ct7d65v8lhw6hxx0k8uk3lw8m     notes-private-db1    Ready    Active         Reachable        19.03.11
k1x2h83b0lrxnh38p3pypt91x     notes-private-svc1   Ready    Active         Reachable        19.03.11
nikgvfe4aum51yu5obqqnnz5s *   notes-public         Ready    Active         Leader           19.03.11

From our laptop, we can query the state of the remote swarm that’s hosted on AWS. Of course, this isn’t limited to querying the state; we can run any other Docker command.
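For example, a quick check that nothing is deployed yet works exactly as it would locally; the command is executed against the remote Docker Engine over SSH:

$ docker service ls     # empty for now -- we have not yet deployed the Notes stack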

We also need to run the following commands, now that the swarm is set up:

$ printf 'vuTghgEXAMPLE…' | docker secret create TWITTER_CONSUMER_KEY -
$ printf 'tOtJqaEXAMPLE…' | docker secret create TWITTER_CONSUMER_SECRET -

Remember that a newly created swarm does not have any secrets; these commands must be rerun to install them whenever the swarm is recreated.
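You can confirm the secrets were created (their values are never displayed) with the following command:

$ docker secret ls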

If you wish to create a shell script to automate this process, consider the following:

scp $AWS_KEY_PAIR ubuntu@${NOTES_PUBLIC_IP}:
ssh -i $AWS_KEY_PAIR ubuntu@${NOTES_PUBLIC_IP} sh swarm-setup.sh `basename ${AWS_KEY_PAIR}`
docker context update --docker host=ssh://ubuntu@${NOTES_PUBLIC_IP} ec2
docker context use ec2
printf $TWITTER_CONSUMER_KEY | docker secret create TWITTER_CONSUMER_KEY -
printf $TWITTER_CONSUMER_SECRET | docker secret create TWITTER_CONSUMER_SECRET -
sh ../ecr/login.sh

This script executes the same commands we just went over to prepare the swarm on the EC2 hosts. It requires the environment variables to be set, as follows:

  • AWS_KEY_PAIR: The filename for the PEM file
  • NOTES_PUBLIC_IP: The IP address of the notes-public EC2 instance
  • TWITTER_CONSUMER_KEY, TWITTER_CONSUMER_SECRET: The access tokens for Twitter authentication
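For example, the variables might be set like this before running the script; the values shown are placeholders reusing examples from earlier in this section:

export AWS_KEY_PAIR=~/Downloads/notes-app-key-pair.pem
export NOTES_PUBLIC_IP=52.39.219.109
export TWITTER_CONSUMER_KEY='vuTghgEXAMPLE…'
export TWITTER_CONSUMER_SECRET='tOtJqaEXAMPLE…'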

In this section, we have deployed more EC2 instances and set up the Docker swarm. While the process was not completely automated, it’s very close. All that’s required, after using Terraform to deploy the infrastructure, is to log in to notes-public, run a script there, and then return to our laptop to set up remote access.

We have set up the EC2 instances and verified we have a working swarm. We still have the outstanding issue of verifying the Docker stack file created in the previous section. To do so, our next step is to deploy the Notes app on the swarm.

Source: Herron David (2020), Node.js Web Development: Server-side web development made easy with Node 14 using practical examples, Packt Publishing.
