Deploying a Docker application to AWS Elastic Beanstalk with AWS CodeSuite (Part 1)

I’ve recently been updating the technical infrastructure for this blog. My aim has been to use AWS as much as possible, partly for convenience and partly for education. Having recently migrated the blog from EC2 to Elastic Beanstalk (EB), my latest project has been to refactor the blog’s build-and-deployment workflow to use AWS, specifically some of the services within the CodeSuite toolset. I’ll be writing some posts over the coming weeks to describe what this project involved.

Given that the new workflow ended up being rather more complicated than the old one (which was based on GitHub Actions and EB CLI), I’ll begin by summarizing the refactored version’s design.

Design summary

Following is the basic sequence of events I outlined for the new workflow:

  1. I push a commit to my GitHub repository
  2. CodeBuild pulls the code from GitHub and initiates a build
  3. CodeBuild logs in to ECR
  4. CodeBuild builds the Docker images, tags them and pushes them to ECR
  5. CodeBuild builds a source bundle for EB and pushes it to S3
  6. CodePipeline pulls the source bundle from S3 and deploys it to EB
  7. EB pulls the Docker images from ECR and starts the application

For anyone who is not familiar with the AWS services involved in this workflow, following are some brief explanations of these:

  • CodeBuild is a continuous-integration (CI) service that orchestrates the build phase of the build-and-deployment process.
  • CodePipeline is a continuous-deployment (CD) service that orchestrates the overall build-and-deployment process.
  • ECR is a container registry that provides remote storage for Docker images.
  • S3 is a general-purpose object storage service.
  • EB is a Platform-as-a-Service (PaaS) product that facilitates the deployment of Web applications built on different platforms, e.g., Docker, Node.js, etc.

I’ll now go into how I implemented this high-level design, starting with how I integrated my GitHub repository with AWS, given that pushing a commit to the repository needs to trigger a run on CodeBuild.

Integrating GitHub and AWS

To integrate my GitHub repository with AWS, I installed the “AWS Connector for GitHub” application to my GitHub account–applications can be installed to a GitHub account via the account’s Settings.

Once the application is installed, it’s possible to authorize it to access either all or only select repositories within an account.

Via the AWS Developer Tools Settings, I then created a connection resource. For this I just needed to choose GitHub as the Provider for the connection; AWS then allowed me to select my specific “AWS Connector for GitHub” installation. Saving the connection resulted in it being available on the Connections page of AWS Developer Tools.

With the connection between GitHub and AWS established, I was now in a position to create the CodeBuild project, the central component of the overall pipeline.

Creating the CodeBuild project

Creating a CodeBuild project generally involves two main steps:

  1. Configuring a CodeBuild project via AWS
  2. Adding a buildspec to a repository for the CodeBuild project to read from

For anyone who is not familiar, a CodeBuild project is a configurable job that focuses on the build stage of a CI/CD pipeline, while a buildspec is a YAML file that defines specific instructions for a CodeBuild project.

As I mentioned in the design summary, the two main side effects of my build stage are (1) for Docker images to be pushed to ECR and (2) for an EB source bundle to be uploaded to S3. I’ll address these specifics in a subsequent post; for the rest of this one I’ll focus on the rudiments of adding the buildspec and configuring the CodeBuild project.

Adding the buildspec

To keep things as simple as possible, then, following is an example of a skeletal buildspec:

version: 0.2
phases:
  build:
    commands:
      - echo Hello, world!
artifacts:
  files:
    - 'test.txt'

In this example, “version” specifies the latest buildspec version, while “phases” specifies the commands CodeBuild should run during each phase of the build. For demo purposes I am using a single phase (“build”) and a single command (“echo Hello, world!”). Under “artifacts”, I am also specifying a single file (“test.txt”) that CodeBuild should consider to be an artifact of the build process.

Like I say, meat will be added to the bones of this buildspec in a subsequent post. For now, though, I’ll move on to discussing how to configure the CodeBuild project to read from the buildspec.

Configuring the CodeBuild project

CodeBuild projects are highly configurable. For the purposes of my buildspec, though, there were relatively few settings I needed to change from their defaults–these are itemized below. (Note the important prerequisite of creating an S3 bucket in which to store the build artifact.)

  • Project configuration
    • Project name: <PROJECT_NAME>
    • Project type: Default project
  • Source
    • Source 1 – Primary
      • Source provider: GitHub
      • Repository: Repository in my GitHub account
      • Repository: <REPOSITORY>
      • Source version: <BRANCH>
  • Buildspec
    • Build specifications
      • Use a buildspec file: true
  • Artifacts
    • Artifact 1 – Primary
      • Type: Amazon S3
      • Bucket name: <BUCKET_NAME>
      • Artifacts packaging: Zip
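
As an aside, these settings map fairly directly onto CodeBuild’s CreateProject API, so the project could also be created from the AWS CLI. Following is a rough sketch of the equivalent call–note that the environment image, compute type and service-role ARN are illustrative placeholders (the console flow above creates the service role for you), and that the push trigger is configured separately:

aws codebuild create-project --cli-input-json file://project.json

Where project.json contains something like:

{
  "name": "<PROJECT_NAME>",
  "source": {
    "type": "GITHUB",
    "location": "https://github.com/<OWNER>/<REPOSITORY>"
  },
  "sourceVersion": "<BRANCH>",
  "artifacts": {
    "type": "S3",
    "location": "<BUCKET_NAME>",
    "packaging": "ZIP"
  },
  "environment": {
    "type": "LINUX_CONTAINER",
    "image": "aws/codebuild/standard:7.0",
    "computeType": "BUILD_GENERAL1_SMALL"
  },
  "serviceRole": "<SERVICE_ROLE_ARN>"
}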

With the CodeBuild project thus configured, pushing a commit to my GitHub repository on the relevant branch successfully kicked off a CodeBuild run. Runs are logged in the CodeBuild project’s Build History.
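
(Equivalently, for anyone who prefers the CLI for this kind of verification, recent runs for the project can be listed with the following, <PROJECT_NAME> being a placeholder:)

aws codebuild list-builds-for-project --project-name <PROJECT_NAME>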

As designed, the run resulted in a compressed version of the build artifact being uploaded to the configured S3 bucket.

Conclusion

In this post I’ve addressed the first two steps of the design summary I provide above: pushing a commit to GitHub and initiating a CodeBuild run. In a subsequent post I’ll aim to address the remaining CodeBuild-related steps of the design summary: logging into ECR; building and tagging the Docker images, and pushing them to ECR; and pushing the EB source bundle to S3.

SSL offloading with AWS Elastic Beanstalk and WordPress

SSL offloading is an approach to handling secure Web traffic in which the computational burden of processing encrypted requests is allocated (or “offloaded”) to a specific component within an application’s environment.

The approach can improve performance as it allows application servers to serve unencrypted requests, which are computationally less expensive than encrypted ones. It can also reduce maintenance overhead as it requires certificates to be installed only on the component that is handling encrypted requests.

The approach obviously cannot be used in environments that require end-to-end encryption; in environments that do not have this requirement, however, it can be a useful technique to employ.

In this post I will describe how SSL offloading was implemented for this blog, a WordPress application that is deployed to AWS Elastic Beanstalk (EB). In so doing I make the following assumptions:

  • The use of .ebextensions files to configure the EB environment
  • The use of the EB CLI to create the environment
  • The use of Apache HTTP Server as the WordPress application’s Web server

With these caveats out of the way, the first step toward implementing SSL offloading for this blog was to ensure the EB environment was instantiated with a load balancer, given that the load balancer is the component that will be handling encrypted requests.

Establishing the load balancer

In order for the EB environment to be instantiated with a load balancer, it was necessary to configure the environment for autoscaling. This is because, unlike single-instance environments, autoscaled environments require a load balancer in order to distribute traffic among EC2 instances. Following is the .ebextensions file that was used to ensure the load balancer was created:

option_settings:
  aws:autoscaling:launchconfiguration:
    InstanceType: {{InstanceType}}
  aws:autoscaling:asg:
    MinSize: {{MinSize}}
    MaxSize: {{MaxSize}}

The config specifies the type of EC2 instance (e.g., t3.small) autoscaling should launch within the Auto Scaling group, as well as the minimum and maximum number of instances that should be allowed within the group. (MinSize and MaxSize can both be set to 1 if a single instance is desired.)

With autoscaling thus configured, the next step toward implementing SSL offloading for this blog was to configure the load balancer itself.

Configuring the load balancer

Given that the load balancer needs to handle encrypted requests and that its default listener doesn’t handle such requests, it was necessary to create a new listener on the load balancer specifically for this purpose. Following is the .ebextensions file that was used to establish this listener:

Resources:
  HttpsListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
      LoadBalancerArn:
        Ref: AWSEBV2LoadBalancer
      Protocol: HTTPS
      Port: 443
      DefaultActions:
        - Type: forward
          TargetGroupArn:
            Ref: AWSEBV2LoadBalancerTargetGroup
      Certificates:
        - CertificateArn: {{CertificateArn}}
      SslPolicy: {{SslPolicy}}

The config specifies the type of resource to create (AWS::ElasticLoadBalancingV2::Listener) and the properties it should be created with. Some notes on the specific properties:

  • LoadBalancerArn: References the load balancer by its logical name
  • Protocol: Specifies that the listener should listen for HTTPS requests
  • Port: Specifies that the listener should listen on port 443
  • DefaultActions: Specifies that requests should be forwarded to the load balancer’s associated target group
  • Certificates: References the SSL certificate (in this case stored in AWS Certificate Manager) that should be used to process requests
  • SslPolicy: Specifies the SSL policy that should be used to enforce standards for the requests

With the load balancer thus configured, the next step toward implementing SSL offloading for this blog was to configure the load balancer’s security group.

Configuring the load balancer’s security group

Given that the load balancer needs to process incoming requests on port 443 and that its security group doesn’t allow such requests by default, it was necessary to create an inbound rule on the security group for this purpose. Following is the .ebextensions file that was used to establish this inbound rule:

Resources:
  HttpsIngressRule:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId:
        Ref: AWSEBLoadBalancerSecurityGroup
      IpProtocol: tcp
      FromPort: 443
      ToPort: 443
      CidrIp: 0.0.0.0/0

The config specifies the type of resource to create (AWS::EC2::SecurityGroupIngress) and the properties it should be created with. Some notes on the specific properties:

  • GroupId: References the security group for the load balancer by its logical name
  • IpProtocol: Specifies that the rule applies to TCP traffic
  • FromPort: Specifies the lowest port number the rule should apply to
  • ToPort: Specifies the highest port number the rule should apply to
  • CidrIp: Specifies that the rule should allow traffic from the outside world

(Note that FromPort and ToPort have the same value; this results in the rule limiting traffic to port 443.)

With the security group thus configured, the EB environment was now ready to be created. Running eb create against the .ebextensions files described above created and configured a load balancer, and configured the load balancer’s security group.
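
At this point a quick smoke test can confirm that the listener terminates SSL as intended. Assuming the environment’s CNAME (a placeholder below), something like the following should print a 200 once the application is up:

# Expect an HTTP 200, with SSL terminated at the load balancer
curl -sS -o /dev/null -w "%{http_code}\n" https://<EB_ENVIRONMENT_CNAME>/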

With the EB requirements addressed, the only remaining step in implementing SSL offloading for this blog was to configure the WordPress application’s Web server to be able to operate in the context of an SSL-offloaded environment.

Configuring the Web server

The Web server used by the WordPress installation for this blog is Apache HTTP Server (Apache). Given that encrypted requests are being offloaded to EB, Apache is free to serve unencrypted requests, which as you’ll recall is one of the benefits of SSL offloading.

In order to resolve URLs correctly, however, WordPress needs to know that a request was originally encrypted, i.e., was sent over HTTPS. As such it was necessary to configure Apache to make WordPress “context-aware.” This was done via the following customization to the Apache conf file:

<VirtualHost *:80>
  ...
  <IfModule mod_setenvif.c>
    SetEnvIf X-Forwarded-Proto "^https$" HTTPS
  </IfModule>
  ...
</VirtualHost>

The customization sets an environment variable (HTTPS) if Apache detects that a request was originally sent over HTTPS–internally WordPress reads from this variable when determining the protocol for URLs. A check is made to ensure Apache’s setenvif module is enabled. If so, the SetEnvIf directive enables the environment variable if the request has an “X-Forwarded-Proto” header with a value matching the supplied regular expression–“X-Forwarded-Proto” is an HTTP header that is sent along with requests from the EB load balancer to Apache. Note that Apache is configured to run on port 80 (VirtualHost *:80).
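
One way to spot-check the directive (a hedged example, run on the instance itself) is to simulate a forwarded request and confirm that WordPress now treats the request as HTTPS:

# Without the header the request is treated as plain HTTP; with it,
# WordPress should behave as if the request were HTTPS (e.g., no
# redirect loop when the site URL is set to https)
curl -s -o /dev/null -w "%{http_code}\n" -H "X-Forwarded-Proto: https" http://localhost/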

With Apache thus configured, SSL offloading was fully implemented for this blog.

Conclusion

While I wasn’t aware of SSL offloading as an approach prior to migrating this blog to EB, finding out about the approach and then implementing it turned out to be an added benefit of the migration, both in terms of simplifying the configuration for the blog’s development and production environments, and in terms of heightening my own awareness of the architecture that underpins an EB environment.

Automating AWS EC2 deployments with GitHub Actions and Systems Manager

This blog is hosted on an AWS EC2 instance; the code for it is stored in a GitHub repository. To update the code on the EC2 instance I previously had to connect to the instance manually via an SSH client, pull the code from the GitHub repository and then run the necessary command to redeploy it. While this got the job done, it always felt rather clunky and tedious. Last week I finally got around to automating the process. In today’s post I’ll be discussing the solution I came up with.

Essentially my approach involves integrating GitHub Actions with AWS Systems Manager to deploy the code to the EC2 instance whenever code is pushed to a branch of the GitHub repository. The GitHub Actions workflow consists of two steps, which are as follows:

  1. Configuring the AWS credentials
  2. Executing the deployment

The rest of this post will go into a bit more detail about each of these steps.

I’ll start by stubbing out the GitHub Actions workflow I’m using:

name: Deploy to EC2

on:
  push:
    branches:
      - master

jobs:
  deploy:
    runs-on: ubuntu-latest

    steps:
    - name: Configure AWS credentials
    ...
    - name: Execute deployment script on EC2 instance
    ...

In the “name” section I specify a name for the workflow. In the “on” section I specify that I want the workflow to run on pushes to the master branch of the repository. Finally in the “jobs” section I outline the “deploy” process–“runs-on” specifies the runner for GitHub Actions to use; “steps” specifies the steps for GitHub Actions to execute.

To configure AWS credentials I use AWS’s official configure-aws-credentials action. This action needs to be configured with the following data:

  1. The IAM user’s access key ID
  2. The IAM user’s secret access key
  3. The region of the EC2 instance to which the code is being deployed

This presupposes of course that the IAM user and an EC2 instance already exist. In my case the latter did but the former didn’t, so I just went ahead and created an IAM user to represent GitHub Actions.
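
The IAM user also needs permission to call Systems Manager’s send-command (used in the second step below). Following is a minimal, illustrative policy sketch–in practice the Resource should be scoped to the specific instance and SSM document rather than left open:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ssm:SendCommand",
      "Resource": "*"
    }
  ]
}

With these things in place the complete step ended up as follows: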

- name: Configure AWS credentials
  uses: aws-actions/configure-aws-credentials@v1
  with:
    aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
    aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    aws-region: us-east-1

(So as not to expose the AWS credentials publicly I store them as secrets in the repo’s Security settings.)

To execute the deployment process on the EC2 instance I use a run step built around Systems Manager’s send-command function, invoking it via the AWS CLI. The docs reveal the function to be highly configurable; the parameters that are relevant to my use case are as follows:

  1. document-name – The name of the Amazon Web Services Systems Manager document (SSM document) to run.
  2. targets – An array of search criteria that targets managed nodes using a key-value combination that you specify.
  3. parameters – The required and optional parameters specified in the document being run.
  4. timeout-seconds – If this time is reached and the command hasn’t already started running, it won’t run.

For document-name I specify “AWS-RunShellScript”–this is a shared resource available via Systems Manager Documents that enables Systems Manager to run a shell script.

For targets, I specify “instanceids” as “Key” and the instance ID of my EC2 instance as “Values.”

For parameters, I specify a string in the following form (where <command> represents a specific instruction to provide to the EC2 instance):

'commands=[
  "<command>"
]'

Finally for timeout-seconds, I specify a value of 600 (10 minutes).

With these things in place the complete step ended up as follows:

- name: Execute deployment script on EC2 instance
  run: |
    aws ssm send-command \
      --document-name "AWS-RunShellScript" \
      --targets "Key=instanceids,Values=${{ secrets.EC2_INSTANCE_ID }}" \
      --parameters 'commands=[
        "<command>"
      ]' \
      --timeout-seconds 600

(Similar to before I store the EC2 instance’s ID as a secret in the repo’s security settings so as not to expose it publicly.)

So this is pretty much it. A push of the code to the master branch of the repo now results in the code being deployed automatically to the EC2 instance via the GitHub Action. Handily the results of the execution are available in Command History under Systems Manager > Run Command.

Clicking into the detail of a command exposes further info such as output and error logging, plus the ability to re-run the command from Systems Manager itself.
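
(The same detail is also retrievable from the CLI, e.g., via the following, where <COMMAND_ID> comes from the send-command output:)

aws ssm list-command-invocations --command-id <COMMAND_ID> --details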

All in all this was a fun little project that took a day or two of tinkering to get working.

Accessing an AWS EC2 instance via Session Manager

This blog is currently hosted on AWS EC2. Until recently I would always connect to my EC2 instance via an SSH client. An alternative approach I learned of recently is to connect via Session Manager, a feature of AWS Systems Manager. The main benefit of Session Manager is that it removes the need to open inbound ports to the instance or to manage SSH keys, as Session Manager handles these security details for you.

Using Session Manager involves a few prerequisites, which can be reduced to the following three-step process:

  1. Provisioning the EC2 instance with the SSM* agent
  2. Provisioning the EC2 instance with an IAM role
  3. Restarting the SSM agent to detect the IAM role

* Simple Systems Manager

Detailed instructions follow. Note that these instructions are specific to Ubuntu 14.04, which I appreciate is quite outdated at the time of writing. Steps 1 and 3 require you to be connected to the EC2 instance (for example via an SSH client). Step 2 requires you to be logged in to the AWS Management Console.

Provisioning the EC2 instance with the SSM agent

The first main step toward connecting to an EC2 instance via Session Manager is to install the SSM agent on the EC2 instance. For my OS this involved running the following commands against the instance:

# Update the OS package index
sudo apt-get update

# Download the SSM agent package
wget https://s3.amazonaws.com/amazon-ssm-us-east-1/latest/debian_amd64/amazon-ssm-agent.deb

# Install the SSM agent package
sudo dpkg -i amazon-ssm-agent.deb

# Start the SSM agent
sudo start amazon-ssm-agent

# Verify the SSM agent status
sudo status amazon-ssm-agent

This last command should produce output like the following:

amazon-ssm-agent start/running, process 4180

Provisioning the EC2 instance with an IAM role

The second main step toward connecting to an EC2 instance via Session Manager is to provision the EC2 instance with an IAM role granting Session Manager permission to connect to the instance. This step involves (1) creating the IAM role and (2) attaching the role to the instance.

Create an IAM role for the EC2 instance

From the AWS Management Console go to IAM. From the left nav click Roles and from the top-right click “Create Role.” You should be taken to a three-step wizard for creating an IAM role.

The first step is to select the trusted entity for the role. For “Trusted Entity Type” choose “AWS Service.” For “Use Case,” choose EC2 as the service and “EC2 Role for AWS Systems Manager” as the use case. You can then proceed to the next step of the wizard.

The second step is to add permissions to the role. All you should need to do for this step is to verify that the relevant policy (AmazonSSMManagedInstanceCore) is attached to the role, which the use case chosen in the previous step should take care of automatically. You can then proceed to the next step of the wizard.

The last step is to name, review, and create the role. Under “Role details” add a name and description for the role. Then create the role and verify that it was created successfully.

Attach the IAM role to the EC2 instance

Still in the AWS Management Console go to EC2. From the left nav click Instances and from the Instances pane select the relevant instance. From the Security submenu of the Actions menu select “Modify IAM Role.” From the “IAM role” menu select the role you created in the previous step. Then click “Update IAM role.”

From the Instances pane select the relevant instance again (assuming it’s not already selected). Verify that the role is attached to the instance–it should be listed under the “IAM Role” heading in the Details tab of the Instances pane.

Still in the Instances pane and with the relevant instance still selected, click Connect. From the “Connect to instance” page select the “Session Manager” tab. You should be presented with a page reporting the instance’s Session Manager connection status.

Note (a) the disabled Connect button and (b) the warning about the instance not being connected to Session Manager–both would be expected at this stage since it’s necessary to restart the SSM agent in order for the instance to detect the updated IAM role.
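
(As an aside, this provisioning step can also be scripted with the AWS CLI. Following is a rough sketch; the role and profile names are placeholders, and trust-policy.json is assumed to contain the standard ec2.amazonaws.com AssumeRole statement.)

# Create the role with an EC2 trust policy
aws iam create-role --role-name <ROLE_NAME> \
  --assume-role-policy-document file://trust-policy.json

# Attach the managed policy Session Manager requires
aws iam attach-role-policy --role-name <ROLE_NAME> \
  --policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore

# EC2 attaches roles via an instance profile
aws iam create-instance-profile --instance-profile-name <PROFILE_NAME>
aws iam add-role-to-instance-profile \
  --instance-profile-name <PROFILE_NAME> --role-name <ROLE_NAME>
aws ec2 associate-iam-instance-profile --instance-id <INSTANCE_ID> \
  --iam-instance-profile Name=<PROFILE_NAME>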

Restarting the SSM agent to detect the IAM role

The last main step, then, toward connecting to an EC2 instance via Session Manager is to restart the SSM agent in order for EC2 to detect the updated IAM role. For my OS this involved running the following command against the instance:

sudo restart amazon-ssm-agent

Back in the AWS Management Console, refreshing the “Connect to instance” page should show that the instance can now be reached.

Note that (a) the warning about the instance not being connected to Session Manager has disappeared and (b) the Connect button has been enabled. If you go ahead and click the Connect button you should be presented with a browser-based terminal from which you can run commands against the EC2 instance.
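
(Sessions aren’t limited to the browser, incidentally: with the AWS CLI and the Session Manager plugin installed locally, a session can also be started from your own terminal.)

aws ssm start-session --target <INSTANCE_ID>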

Congratulations! You’ve now successfully connected to an EC2 instance via Session Manager.

Creating queues with ES6

In my last post I discussed creating a binary search tree with ES6. In this post I’ll be discussing a different type of data structure: queues. Once again I’ll be leaning on Data Structures and Algorithms With JavaScript by Michael McMillan for insight.

What is a queue?

A queue is a linear data structure that stores items in the order in which they are generated. A queue is rather like a list where items are added to the end and removed from the beginning. This type of data structure is known as a “first-in, first-out” data structure. It may help to think of a queue as a line at a grocery store where customers join at the back and check out at the front.

Creating a queue

Creating a queue requires a single class. The class should have one property for storing the data along with several standard methods for working with the data, e.g., adding items to the queue, removing items from the queue and querying the queue. Exact property and method names may vary but such a class may be designed as follows:

// A basic queue
class Queue {
  // Creates the data store
  constructor(dataStore = []) {
    this.dataStore = dataStore;
  }
  // Adds an element to the back of the queue
  push(element) {
    this.dataStore.push(element);
  }
  // Removes and returns the element at the front of the queue
  shift() {
    return this.dataStore.shift();
  }
  // Inspects the first element in the queue
  peekFront() {
    return this.dataStore[0];
  }
  // Inspects the last element in the queue
  peekBack() {
    return this.dataStore[this.dataStore.length - 1];
  }
  // Checks to see if the queue is empty
  isEmpty() {
    return !this.dataStore.length;
  }
  // Outputs the contents of the queue
  toString() {
    let str = '';
    for (let i = 0; i < this.dataStore.length; i++) {
      str += `${this.dataStore[i]}\n`;
    }
    return str;
  }
}

This simple class essentially proxies native array properties and methods in order to work with the data. For example the push() method that adds items to the queue proxies Array.prototype.push(); the shift() method that removes items from the queue proxies Array.prototype.shift(); and the isEmpty() method that checks to see if the queue is empty proxies Array.length. The class also has methods for inspecting the first and last elements in the queue (peekFront() and peekBack()), and outputting the contents of the queue (toString()).

Let’s now create a queue and add some items to it:

const queue = new Queue();
queue.push('George Washington');
queue.push('John Adams');
queue.push('Thomas Jefferson');
queue.push('James Madison');
queue.push('James Monroe');

Outputting the contents of the queue should return the following:

George Washington
John Adams
Thomas Jefferson
James Madison
James Monroe

Notice how each new item has been added to the back of the queue?

Let’s now remove an element from the queue using queue.shift() and see how this affects the output:

John Adams
Thomas Jefferson
James Madison
James Monroe

Notice how the first item has been removed from the front of the queue?

Let’s now inspect the first and last items in the queue:

queue.peekFront(); // John Adams
queue.peekBack(); // James Monroe

So far, so predictable.

Creating a double-ended queue

A more specific kind of queue is called a double-ended queue or “deque” (pronounced “deck”). In a deque items can be added to and removed from both the front and the back of the queue. Creating a deque requires us to extend our basic queue with a couple of extra methods: an unshift() method for adding items to the front of the queue and a pop() method for removing items from the back of the queue. Again these methods proxy the native array methods Array.prototype.unshift() and Array.prototype.pop().

class Deque extends Queue {
  ...
  // Adds an element to the front of the queue
  unshift(element) {
    this.dataStore.unshift(element);
  }
  // Removes and returns the element at the back of the queue
  pop() {
    return this.dataStore.pop();
  }
  ...
}

Let’s now create a deque and add some items to it:

const deque = new Deque();
deque.unshift('George Washington');
deque.unshift('John Adams');
deque.unshift('Thomas Jefferson');
deque.unshift('James Madison');
deque.unshift('James Monroe');

Outputting the contents of the queue should return the following:

James Monroe
James Madison
Thomas Jefferson
John Adams
George Washington

Notice how adding the items to the front of the queue affects the order?

Let’s now remove an item from the queue with deque.pop() and see how this affects the output:

James Monroe
James Madison
Thomas Jefferson
John Adams

Notice how the item has been removed from the back of the queue?

Let’s now inspect the first and last elements in the queue:

deque.peekFront(); // James Monroe
deque.peekBack(); // John Adams

Straightforward enough!

Creating a priority queue

Another more specific kind of queue is called a priority queue. In a priority queue items are removed based on a manually defined “priority” as opposed to an automatically defined position (first or last).

As an example let’s take the line of succession to the U.S. presidency, in which the successor to the office is based on a set order of priority. A simple data model for a successor could look like this:

office: String // office to which the successor belongs
order: Number // the successor's order of priority

Creating a line of succession class once again requires us to extend our basic queue with a few methods: a special implementation of the shift() method for removing items from the queue, a special implementation of the toString() method for outputting the contents of the queue, and a count() method for returning the number of items in the queue.

class LineOfSuccession extends Queue {
  // Removes and returns the element with the highest priority (lowest order)
  shift() {
    let order = 0;
    for (let i = 1; i < this.count(); ++i) {
      if (this.dataStore[i].order < this.dataStore[order].order) {
        order = i;
      }
    }
    return this.dataStore.splice(order, 1);
  }
  // Outputs the contents of the queue
  toString() {
    let retStr = '';
    for (let i = 0; i < this.dataStore.length; i++) {
      retStr += `${this.dataStore[i].office}\n`;
    }
    return retStr;
  }
  // Returns the number of items in the queue
  count() {
    return this.dataStore.length;
  }
}

The shift() method works by returning the item with the highest priority from the queue. It does this by looping through all the items in the queue and, whenever it encounters an item with a higher priority (i.e., a lower order value) than the current candidate, making that item the new candidate.

Let’s now create a line of succession:

const los = new LineOfSuccession([
  {office: 'Speaker of the House of Representatives', order: 2},
  {office: 'Vice President', order: 1},
  {office: 'Secretary of the Treasury', order: 5},
  {office: 'Secretary of State', order: 4},
  {office: 'President pro tempore of the Senate', order: 3}
]);

Notice how this time we’re passing the data into the queue’s constructor rather than adding the items manually with queue.push()? Also notice how the data is in no particular order as it’s being passed in? Outputting the contents of the queue should return the following:

Speaker of the House of Representatives
Vice President
Secretary of the Treasury
Secretary of State
President pro tempore of the Senate

Now let’s create a successor variable and start pulling (removing) successors from the queue.

let successor;
successor = los.shift();
successor[0].office; // Vice President
successor = los.shift();
successor[0].office; // Speaker of the House of Representatives
successor = los.shift();
successor[0].office; // President pro tempore of the Senate
successor = los.shift();
successor[0].office; // Secretary of State
successor = los.shift();
successor[0].office; // Secretary of the Treasury

Notice how each successor is being removed from the queue based on priority?

Conclusion

In this post I’ve described the basic idea of the queue data structure and, to see how it works in practice, used ES6 to implement a few different kinds of queue: a basic queue, a double-ended queue and a priority queue. The main differences between these kinds of queue can be summarized as follows:

  • In a basic queue items are added to the back and removed from the front.
  • In a double-ended queue items can be added to and removed from both the front and the back.
  • In a priority queue items are removed based on a manually defined priority.

Creating a binary search tree with ES6

I recently started reading Data Structures and Algorithms With JavaScript by Michael McMillan. Not having an academic background in computer science I’ve tended to shy away from this subject. With front-end development becoming an ever more complex endeavor, however, I felt it was about time to dive in and see what I’ve been missing. This and somebody recently asked me a question about binary search trees, about which I was utterly clueless. Guilt can be a good motivator, I guess.

What are trees?

McMillan defines a tree as a “nonlinear data structure that is used to store data in a hierarchical manner.” In this context a nonlinear data structure can be defined as a data structure in which data is not arranged sequentially, while a hierarchical data structure can be defined as a data structure in which data is organized into levels. A specific terminology is used when discussing trees. Some terms I’ll be using in this post include:

  • Root
  • Child
  • Parent
  • Leaf
  • Edge
  • Path
  • Level
  • Depth
  • Key value

Binary trees and binary search trees are special kinds of tree. In a binary tree, a node can have no more than two child nodes; in a binary search tree (BST), lesser values are stored in left nodes and greater values are stored in right nodes. The following diagram depicts a binary search tree.

A binary search tree with three levels. The root has a key value of 4 and has children with key values of 2 and 6. Both these nodes also have children of their own: The node with a key value of 2 is parent to nodes with key values of 1 and 3; the node with a key value of 6 is parent to nodes with key values of 5 and 7. All nodes on level 2 are leaves.

In this post I’ll be creating this BST using ES6 and adding some methods to it for adding and retrieving data. The code for my creation is available on CodePen.

Creating the BST

Creating the empty BST turns out to be relatively straightforward. All that’s needed is a class to represent a node and a class to represent the BST. A node holds references to the data it’s supposed to store as well as to its children (left and right nodes). The BST holds a reference to the root, which starts out as null. The basic classes end up looking like this:

class Node {
  constructor(data, left = null, right = null) {
    this.data = data;
    this.left = left;
    this.right = right;
  }
}

class BST {
  constructor() {
    this.root = null;
  }
}

Notice how the values of a node’s children are initialized using ES6 default parameters. Creating the BST is a simple matter of instantiating the BST class: const bst = new BST();.

Adding nodes to the BST

So far so good but an empty tree isn’t much use to anyone. In order to add nodes to the tree we’re going to need a method for doing so. Following is the insert() method McMillan defines, translated to ES6 from his ES5:

class BST {
  ...
  insert(data) {
    const node = new Node(data);
    if (this.root === null) {
      this.root = node;
    } else {
      let current = this.root;
      let parent;
      while(true) {
        parent = current;
        if (data < current.data) {
          current = current.left;
          if (current === null) {
            parent.left = node;
            break;
          }
        } else {
          current = current.right;
          if (current === null) {
            parent.right = node;
            break;
          }
        }
      }
    }
  }
}

The insert() method works by creating a new node, passing the data it receives into the new node’s constructor. The method then does one of two things:

  1. If the BST doesn’t have a root, it makes the new node the root.
  2. If the BST does have a root, it traces a path through the BST until it finds an insertion point for the new node. Essentially this involves determining whether the new node should be inserted as the left or right child of a given parent. This is based on whether the new node’s value is lesser or greater than the parent’s value.

So let’s go ahead and insert some nodes and see how this works in practice.

bst.insert(4);
bst.insert(2);
bst.insert(6);
bst.insert(1);
bst.insert(3);
bst.insert(5);
bst.insert(7);

Following is a table that illustrates the inner workings of the insert() method for each of the values we’re inserting. (A key to the column headings follows the table.)

1   | 2    | 3   | 4     | 5     | 6    | 7
4   | null | n/a | n/a   | n/a   | n/a  | insert
2   | 4    | 4   | true  | left  | null | insert
6   | 4    | 4   | false | right | null | insert
1   | 4    | 4   | true  | left  | 2    | iterate
n/a | 4    | 2   | true  | left  | null | insert
3   | 4    | 4   | true  | left  | 2    | iterate
n/a | 4    | 2   | false | right | null | insert
5   | 4    | 4   | false | right | 6    | iterate
n/a | 4    | 6   | true  | left  | null | insert
7   | 4    | 4   | false | right | 6    | iterate
n/a | 4    | 6   | false | right | null | insert

  1. New node value
  2. Root node value
  3. Current node value
  4. New node value < current node value?
  5. New node should be inserted to left or right?
  6. Value of node at insertion point
  7. Result

Retrieving the minimum and maximum values from the BST

Two important implications of the insert() method are that:

  • The minimum value in the BST is always the leftmost value in the BST.
  • The maximum value in the BST is always the rightmost value in the BST.

Given these rules, defining methods to retrieve these values becomes fairly trivial.

Retrieving the minimum value

Let’s define a getMin() method for retrieving the minimum value from the BST:

class BST {
  ...
  getMin() {
    let current = this.root;
    while(current.left !== null) {
      current = current.left;
    }
    return current;
  }
}

The method can be called with a simple bst.getMin();. The following table illustrates the method’s inner workings:

Current node | Left node | Result
4            | 2         | iterate
2            | 1         | iterate
1            | null      | return

Retrieving the maximum value

Let’s now define a getMax() method for retrieving the maximum value from the BST:

class BST {
  ...
  getMax() {
    let current = this.root;
    while(current.right !== null) {
      current = current.right;
    }
    return current;
  }
}

This method can be called with a simple bst.getMax();. The following table illustrates the method’s inner workings:

Current node | Right node | Result
4            | 6          | iterate
6            | 7          | iterate
7            | null       | return

Finding a specific node in the BST

Finding a specific node in the BST is a matter of tracing a path through the BST until either a value is found that matches the requested value or a value of null is found, in which case it can be safely said that the BST does not contain the requested value. Following is the find() method McMillan defines, once again translated to ES6 from his ES5:

class BST {
  ...
  find(data) {
    let current = this.root;
    while (current.data !== data) {
      if (data < current.data) {
        current = current.left;
      } else {
        current = current.right;
      }
      if (current === null) {
        return null;
      }
    }
    return current;
  }
}

Let’s try to find the node with a value of 3 by calling the method with bst.find(3);. Following is a table that illustrates the method’s inner workings. (A key to the column headings follows the table.)

1 | 2     | 3     | 4     | 5   | 6
4 | false | true  | left  | 2   | iterate
2 | false | false | right | 3   | iterate
3 | true  | n/a   | n/a   | n/a | return

  1. Current node value
  2. Is the current node value equal to the requested node value?
  3. Is the requested node value less than the current node value?
  4. Is the new current node to the left or right of the existing current node?
  5. New current node value
  6. Result

Conclusion

In this post we learned to differentiate between trees, binary trees and binary search trees (BSTs). We also created a BST using ES6 and added some methods to it for adding and retrieving data. Unfortunately we didn’t have time to cover some more advanced BST topics such as tree traversal and removing nodes–maybe this can be the subject of a future post.

Organizing Passwords, PII and Email

As I prepare to start a new job on Monday, the first new job I’ll have started for more than seven years (God, please don’t let me get beaten up behind the bike sheds on my first day!), I’ve been making an effort to organize the detritus of my online life.

Loyal readers of this blog—if indeed such emotionally troubled individuals are currently blessed with reading privileges in the various jailhouses and detention centers they occupy—will recall my inaugural post about a password manager application I’ve been using called 1Password, made by a Canadian software company called AgileBits. As the name “1Password” suggests, the (IMO truly magnificent) app allows you to create one strong, memorable password that gives you access to a database into which you can enter all your other Web site log-in credentials, the passwords for which can be as strong and unmemorable as you wish. As if this functionality were not useful enough, the app also gives you the ability to log in to your favorite Web sites automatically, which in effect frees you from the very 21st-century hassle of entering slight variants of the same credentials into Web-based log-in forms time after tedious time.

In more recent days I’ve also been exploring some of 1Password’s additional capabilities, such as the way it allows you to store personal information (e.g., bank account, credit card and driver’s license information), in what the app calls “Wallet items.” Given my increasing reliance on the Web for performing financial and commercial transactions, having this information stored centrally in such a fashion has been proving pretty useful. Now, for instance, I don’t have to search my coat pockets for my actual wallet whenever I need my credit card for Turnov’s Small and Short men’s shop, a card whose number, expiry date and security code have escaped—or indeed have never been stored in—my memory. This liberating upshot is all the more satisfying given the numerous types of coat I’ve had to wear this winter here in the environs of Alexandria, Va., where the weather has been changing as much as a chameleon with multiple personality disorder in a dressing room.

To turn to, or at least look askance at, the subject of email organization, I’ve also been striving to exert a measure of control over my hitherto untamed Gmail inbox. This effort has necessitated creating a large number of labels and a not small number of filters. I now have as many as 17 top-level labels for my received mail, each of which has sub-labels that allow for more specific content characterization. Without wishing to give you too much insight into my personal life, here are the top-level labels I’ve settled on to-date:

  • Banking
  • Car (“Pollution-Spewing Heap of Rust” would be more accurate)
  • Credit Cards
  • Entertainment
  • Health & Fitness (just the one email under this label so far)
  • Home
  • Insurance
  • Investments
  • Local Government
  • Meetups (nice to get out of Chez Smith from time to time)
  • Personal
  • Personal Finance
  • Shopping
  • Taxes (would “Death and Taxes” last longer?)
  • Travel & Transit
  • Web Services
  • Work

Readers who take their mobility for granted might argue that going to such organizational lengths as these evinces a certain anal-retentiveness in me. And while I wouldn’t render them entirely unable to walk for arguing this way, I would suggest that if the feeling one gets from having an organized inbox is preferable to the feeling one gets from having a disorganized inbox, then the better feeling alone justifies the effort one has made in organizing it. So there, with highly polished brass knobs on!

In conclusion, I think “personal information architecture” projects like these will increase in importance as our weird, wired world becomes more data-centric and IT-reliant. Instead of finishing with a supposition, however, I’ll finish with a question: Do you—my loyal, incarcerated readers—attempt to organize your inboxes? Or do you allow them to grow untamed, like the tangled hair of an aging hippie?

1Password: The Summit of Password Managers?

If I were to suggest to you that global warming, population growth and password management were among the 21st century’s greatest problems, you would of course be perfectly justified in observing that only two of these topics deserve such weighty description: As we all know, the dangers to mankind of global warming are vastly overstated and in fact may be apocryphal.

Before any partisan bickering that my previous assertion might have provoked descends into outright violence, allow me to raise my voice above the din for a moment to plead, “I was only joking!” Compared to the mountainous environmental and demographic issues already identified, the personal tech problem of managing one’s Web site login credentials seems molehill-like indeed. Modest though its heights may be, however, they still must be scaled. It pleases me to report that with the aid of a trusty Sherpa I have in recent days been able to scale them!

Playing Tenzing Norgay to my Edmund Hillary is 1Password, a desktop password-management application from the privately held Canadian company AgileBits. According to the manufacturer’s Web site, versions for Mac, Windows, iPhone, iPad and Android are available. This discussion concerns 1Password 3 for Mac, which requires OS X Snow Leopard and higher. (Tiger and Leopard users, the manufacturer mollifies, can use 1Password 2 and sync their data with Snow Leopard or Lion.) A single-user license cost me $49.99 from the manufacturer’s Web site.

The central idea of 1Password is straightforward:

  1. You create a single, master password to the app.
  2. You enter your login credentials for any given Web site into the app’s database.
  3. The app, upon entry of the correct master password, gives you access to all the credentials you’ve entered.

The upshot is that you only have to remember one password (hence the name, duh!) in order to have access to your login credentials for any given site. This comes in especially handy if, like me, you have accumulated several variations of usernames and passwords since Web time (Berners-Lee time?) began. I, for instance, have amassed as many as 20 unique usernames!

Having only one password to remember also enables you to practice good Web security and create a strong, unique password for each of your stored sites. 1Password conveniently includes a random-password generator for the purpose. The generator features several password-customization options, including the number of characters the password should contain, the number of digits or special characters it should contain, whether it can contain the same character more than once, and even whether it should be pronounceable!

1Password also includes a delightfully time-saving auto-login feature, which alone justifies the app’s license fee IMHO. Double-click the name of any given Web site from within the app to be instantly directed to and logged into that site. For even quicker, one-click access, install the app’s Web browser extension, in which you can also perform other common tasks, such as data entry.

What happens if a Web site requires you to enter your username on one screen and your password on another? Simply create two records in the database: one for the username screen and one for the password screen, taking care to name each record meaningfully. Then simply choose the relevant record from the app or browser extension.

Do you use a Mac at home and a PC at work? Combine 1Password with the built-in 1PasswordAnywhere and the file-hosting service Dropbox for access to your login credentials from wherever you happen to be doing your Web browsing. In addition to login credentials, 1Password also promises the ability to store other types of sensitive information, such as software licenses, free-form text, and personal and financial information. I’m looking forward to trying these features out in the coming days.

In the meantime you can color me impressed. Although there may be freely available alternatives that perform just as well, I feel as though my money has been well-spent on 1Password. A piece of software that saves me time and makes me feel a bit more organized seems to me invaluable in this time-crunched era of ours. If these do not strike you as reasons enough to reach for the summit of your own password management mountain (or molehill), then why not turn to the sentiments of another celebrated Everest assaulter: Do it because it’s there!