Deploying a Docker application to AWS Elastic Beanstalk with AWS CodeSuite (Part 1)

This is the first in a three-part series of posts on deploying a Docker application to AWS Elastic Beanstalk with AWS CodeSuite. The Docker application in question is this blog, a WordPress application backed by a MySQL database.


I’ve recently been updating the technical infrastructure for this blog. My aim has been to use AWS as much as possible, partly for convenience and partly for education. Having recently migrated the blog from EC2 to Elastic Beanstalk (EB), my latest project has been to refactor the blog’s build-and-deployment workflow to use AWS, specifically some of the services within the CodeSuite toolset. I’ll be writing some posts over the coming weeks to describe what this project involved.

Given that the new workflow ended up being rather more complicated than the old one (which was based on GitHub Actions and the EB CLI), I’ll begin by summarizing the refactored version’s design.

Design summary

Following is the basic sequence of events I outlined for the new workflow:

  1. I push a commit to my GitHub repository
  2. CodeBuild pulls the code from GitHub and initiates a build
  3. CodeBuild logs in to ECR
  4. CodeBuild builds the Docker images, tags them and pushes them to ECR
  5. CodeBuild builds a source bundle for EB and pushes it to S3
  6. CodePipeline pulls the source bundle from S3 and deploys it to EB
  7. EB pulls the Docker images from ECR and starts the application

Or, to express this as a diagram…

For anyone who is not familiar with the AWS services involved in this workflow, following are some brief explanations:

  • CodeBuild is a continuous-integration (CI) service that orchestrates the build phase of the build-and-deployment process.
  • CodePipeline is a continuous-deployment (CD) service that orchestrates the overall build-and-deployment process.
  • ECR is a container registry that provides remote storage for Docker images.
  • S3 is a general-purpose object storage service.
  • EB is a Platform-as-a-Service (PaaS) product that facilitates the deployment of Web applications built on different platforms, e.g., Docker, Node.js, etc.

I’ll now go into how I implemented this high-level design, starting with how I integrated my GitHub repository with AWS, given that pushing a commit to the repository needs to trigger a run on CodeBuild.

Integrating GitHub and AWS

To integrate my GitHub repository with AWS, I installed the “AWS Connector for GitHub” application on my GitHub account; applications can be installed on an account via its Settings.

Once the application is installed, it’s possible to authorize it to access either all or only select repositories within an account.

Via the AWS Developer Tools Settings, I then created a connection resource. For this I just needed to choose GitHub as the Provider for the connection; AWS then allowed me to select my specific “AWS Connector for GitHub” installation. Saving the connection resulted in it being available on the Connections page of AWS Developer Tools.

With the connection between GitHub and AWS established, I was now in a position to create the CodeBuild project, the central component of the overall pipeline.

Creating the CodeBuild project

Creating a CodeBuild project generally involves two main steps:

  1. Configuring a CodeBuild project via AWS
  2. Adding a buildspec to a repository for the CodeBuild project to read from

For anyone who is not familiar, a CodeBuild project is a configurable job that focuses on the build stage of a CI/CD pipeline, while a buildspec is a YAML file that defines specific instructions for a CodeBuild project.

As I mentioned in the design summary, the two main side effects of my build stage are (1) for Docker images to be pushed to ECR and (2) for an EB source bundle to be uploaded to S3. I’ll address these specifics in a subsequent post; for the rest of this one I’ll focus on the rudiments of adding the buildspec and configuring the CodeBuild project.

Adding the buildspec

To keep things as simple as possible, then, following is an example of a skeletal buildspec:

version: 0.2
phases:
  build:
    commands:
      - echo Hello, world!
artifacts:
  files:
    - 'test.txt'

In this example, “version” specifies the latest buildspec version, while “phases” specifies the commands CodeBuild should run during each phase of the build. For demo purposes I am using a single phase (“build”) and a single command (“echo Hello, world!”). Under “artifacts”, I am also specifying a single file (“test.txt”) that CodeBuild should consider to be an artifact of the build process.
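For reference, buildspec files support additional named phases besides “build”; CodeBuild runs install, pre_build, build and post_build in that order, so authentication steps conventionally go in pre_build and publishing steps in post_build. A hypothetical multi-phase skeleton (the echo commands are placeholders):

```yaml
version: 0.2
phases:
  install:
    commands:
      - echo Installing tooling...   # e.g., toolchain setup
  pre_build:
    commands:
      - echo Authenticating...       # e.g., registry logins
  build:
    commands:
      - echo Building...             # e.g., compiling code or building images
  post_build:
    commands:
      - echo Publishing...           # e.g., pushing build outputs
artifacts:
  files:
    - 'test.txt'
```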

Like I say, meat will be added to the bones of this buildspec in a subsequent post. For now, though, I’ll move on to discussing how to configure the CodeBuild project to read from the buildspec.

Configuring the CodeBuild project

CodeBuild projects are highly configurable. For the purposes of my buildspec, though, there were relatively few settings I needed to change from their defaults–these are itemized below. (Note the important prerequisite of creating an S3 bucket in which to store the build artifact.)

  • Project configuration
    • Project name: <PROJECT_NAME>
    • Project type: Default project
  • Source
    • Source 1 – Primary
      • Source provider: GitHub
      • Repository: Repository in my GitHub account
      • Repository: <REPOSITORY>
      • Source version: <BRANCH>
  • Buildspec
    • Build specifications
      • Use a buildspec file: true
  • Artifacts
    • Artifact 1 – Primary
      • Type: Amazon S3
      • Bucket name: <BUCKET_NAME>
      • Artifacts packaging: Zip

With the CodeBuild project thus configured, pushing a commit to my GitHub repository on the relevant branch successfully kicked off a CodeBuild run. Runs are logged in the CodeBuild project’s Build History.

As designed, the run resulted in a compressed version of the build artifact being uploaded to the configured S3 bucket.

Conclusion

In this post I’ve addressed the first two steps of the design summary I provide above: pushing a commit to GitHub and initiating a CodeBuild run. In a subsequent post I’ll aim to address the remaining CodeBuild-related steps of the design summary: logging into ECR; building and tagging the Docker images, and pushing them to ECR; and pushing the EB source bundle to S3.

SSL offloading with AWS Elastic Beanstalk and WordPress

SSL offloading is an approach to handling secure Web traffic in which the computational burden of processing encrypted requests is allocated (or “offloaded”) to a specific component within an application’s environment.

The approach can improve performance as it allows application servers to serve unencrypted requests, which are computationally less expensive than encrypted ones. It can also reduce maintenance overhead as it requires certificates to be installed only on the component that is handling encrypted requests.

The approach obviously cannot be used in environments that require end-to-end encryption; in environments that do not have this requirement, however, it can be a useful technique to employ.

In this post I will describe how SSL offloading was implemented for this blog, a WordPress application that is deployed to AWS Elastic Beanstalk (EB). In so doing I make the following assumptions:

  • The use of .ebextensions files to configure the EB environment
  • The use of the EB CLI to create the environment
  • The use of Apache HTTP Server as the WordPress application’s Web server

With these caveats out of the way, the first step toward implementing SSL offloading for this blog was to ensure the EB environment was instantiated with a load balancer, given that the load balancer is the component that will be handling encrypted requests.

Establishing the load balancer

In order for the EB environment to be instantiated with a load balancer, it was necessary to configure the environment as a load-balanced, autoscaled environment; unlike single-instance environments, these include a load balancer to distribute traffic among EC2 instances. Following is the .ebextensions file that was used to ensure the load balancer was created:

option_settings:
  aws:autoscaling:launchconfiguration:
    InstanceType: {{InstanceType}}
  aws:autoscaling:asg:
    MinSize: {{MinSize}}
    MaxSize: {{MaxSize}}

The config specifies the type of EC2 instance (e.g., t3.small) autoscaling should launch within the Auto Scaling group, as well as the minimum and maximum number of instances that should be allowed within the group. (MinSize and MaxSize can both be set to 1 if a single instance is desired.)
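For illustration, a filled-in version of this config, assuming the t3.small instance type mentioned above and a fixed group size of one, would look like this:

```yaml
option_settings:
  aws:autoscaling:launchconfiguration:
    InstanceType: t3.small
  aws:autoscaling:asg:
    MinSize: 1
    MaxSize: 1
```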

With autoscaling thus configured, the next step toward implementing SSL offloading for this blog was to configure the load balancer itself.

Configuring the load balancer

Given that the load balancer needs to handle encrypted requests and that its default listener doesn’t handle such requests, it was necessary to create a new listener on the load balancer specifically for this purpose. Following is the .ebextensions file that was used to establish this listener:

Resources:
  HttpsListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
      LoadBalancerArn:
        Ref: AWSEBV2LoadBalancer
      Protocol: HTTPS
      Port: 443
      DefaultActions:
        - Type: forward
          TargetGroupArn:
            Ref: AWSEBV2LoadBalancerTargetGroup
      Certificates:
        - CertificateArn: {{CertificateArn}}
      SslPolicy: {{SslPolicy}}

The config specifies the type of resource to create (AWS::ElasticLoadBalancingV2::Listener) and the properties it should be created with. Some notes on the specific properties:

  • LoadBalancerArn: References the load balancer by its logical name
  • Protocol: Specifies that the listener should listen for HTTPS requests
  • Port: Specifies that the listener should listen on port 443
  • DefaultActions: Specifies that requests should be forwarded to the load balancer’s associated target group
  • Certificates: References the SSL certificate (in this case stored in AWS Certificate Manager) that should be used to process requests
  • SslPolicy: Specifies the SSL policy that should be used to enforce standards for the requests
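For illustration, the two templated values might be filled in along these lines; the certificate ARN below is hypothetical, while ELBSecurityPolicy-2016-08 is the long-standing default policy for Application Load Balancers:

```yaml
      Certificates:
        - CertificateArn: arn:aws:acm:us-east-1:123456789012:certificate/abcd1234-ab12-cd34-ef56-abcdef123456
      SslPolicy: ELBSecurityPolicy-2016-08
```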

With the load balancer thus configured, the next step toward implementing SSL offloading for this blog was to configure the load balancer’s security group.

Configuring the load balancer’s security group

Given that the load balancer needs to process incoming requests on port 443 and that its security group doesn’t allow such requests by default, it was necessary to create an inbound rule on the security group for this purpose. Following is the .ebextensions file that was used to establish this inbound rule:

Resources:
  HttpsIngressRule:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId:
        Ref: AWSEBLoadBalancerSecurityGroup
      IpProtocol: tcp
      FromPort: 443
      ToPort: 443
      CidrIp: 0.0.0.0/0

The config specifies the type of resource to create (AWS::EC2::SecurityGroupIngress) and the properties it should be created with. Some notes on the specific properties:

  • GroupId: References the security group for the load balancer by its logical name
  • IpProtocol: Specifies that the rule applies to TCP traffic
  • FromPort: Specifies the lowest port number the rule should apply to
  • ToPort: Specifies the highest port number the rule should apply to
  • CidrIp: Specifies that the rule should allow traffic from the outside world

(Note that FromPort and ToPort have the same value; this results in the rule limiting traffic to port 443.)

With the security group thus configured, the EB environment was now ready to be created. Running eb create against the .ebextensions files described above created and configured a load balancer, and configured the load balancer’s security group.

With the EB requirements addressed, the only remaining step in implementing SSL offloading for this blog was to configure the WordPress application’s Web server to be able to operate in the context of an SSL-offloaded environment.

Configuring the Web server

The Web server used by the WordPress installation for this blog is Apache HTTP Server (Apache). Given that encrypted requests are being offloaded to the EB load balancer, Apache is free to serve unencrypted requests, which, as you’ll recall, is one of the benefits of SSL offloading.

In order to resolve URLs correctly, however, WordPress needs to know that a request was originally encrypted, i.e., was sent over HTTPS. As such it was necessary to configure Apache to make WordPress “context-aware.” This was done via the following customization to the Apache conf file:

<VirtualHost *:80>
  ...
  <IfModule mod_setenvif.c>
    SetEnvIf X-Forwarded-Proto "^https$" HTTPS
  </IfModule>
  ...
</VirtualHost>

The customization sets an environment variable (HTTPS) if Apache detects that a request was originally sent over HTTPS; internally, WordPress reads from this variable when determining the protocol for URLs. The <IfModule> wrapper first checks that Apache’s setenvif module is enabled. If it is, the SetEnvIf directive sets the variable whenever a request carries an “X-Forwarded-Proto” header whose value matches the supplied regular expression; “X-Forwarded-Proto” is an HTTP header the EB load balancer adds to the requests it forwards to Apache. Note that Apache itself continues to listen for unencrypted traffic on port 80 (VirtualHost *:80).
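As an aside, WordPress’s documentation on reverse-proxy setups describes an equivalent application-level approach: detecting the “X-Forwarded-Proto” header in wp-config.php rather than in the Apache conf. A hypothetical version of that alternative (not what this blog uses) might look like:

```php
// In wp-config.php, before the "That's all, stop editing!" line:
// if the load balancer reports that the original request was HTTPS,
// tell WordPress so that is_ssl() returns true and URLs resolve correctly.
if ( isset( $_SERVER['HTTP_X_FORWARDED_PROTO'] )
     && 'https' === $_SERVER['HTTP_X_FORWARDED_PROTO'] ) {
    $_SERVER['HTTPS'] = 'on';
}
```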

With Apache thus configured, SSL offloading was fully implemented for this blog.

Conclusion

While I wasn’t aware of SSL offloading as an approach prior to migrating this blog to EB, finding out about the approach and then implementing it turned out to be an added benefit of the migration, both in terms of simplifying the configuration for the blog’s development and production environments, and in terms of heightening my own awareness of the architecture that underpins an EB environment.