Infrastructure as Code, Part One

Let's say you've developed a new feature and you want to release it.

You've avoided all the typical pitfalls when it comes to making a new release and you've done everything as you were meant to. It's not a Friday, it's not 5 pm, and so on.

But your organization is still doing manual releases. That means a systems administrator logs on to each of your production machines and deploys the latest version of the code.

At your organization, the rules demand that you submit a rollout request in advance. And in this instance, you are granted a rollout window for Tuesday afternoon. Your application changes have already been successfully deployed in the staging environment. The tests pass, and everything else has gone smoothly. You're feeling confident.

Tuesday afternoon rolls around, and it's time to make the release to the production environment.

The deployment is successful on the first machine. And the second machine. But wait. Something goes wrong on the third machine.

It turns out that the third production server has a different set of application dependencies installed. The versions are incompatible.

You start debugging, but there's a fixed time window for the deployment, and time is running out...

Eventually, time runs out. You've missed the window. The entire release is rolled back. And now you have to request another rollout. But approval from stakeholders is slow, so the next opportunity is a week away.

Damn.

You spend the rest of the day investigating what went wrong. Eventually, you figure it out. Somebody logged on to the third machine last week and manually updated some of the software. These changes were never propagated to the other servers, or back to the staging environment, or dev environments.

Does any of this feel familiar? If so, you're not alone.

Fortunately, Infrastructure as Code (IaC) can help you mitigate all of the problems described above. In part one of this IaC miniseries, I will introduce you to the basic concepts and explain some of the benefits.

An Introduction

As the name suggests, Infrastructure as Code uses code to provision, configure, and manage infrastructure.

Using the right set of tools, it is straightforward to create a description of the infrastructure on which your application should be deployed. This description includes the specification of virtual machines, storage, software stacks, network configurations, security features, user accounts, access control limits, and so on.

This description is written as code, often in a declarative language. The language varies with the tools you choose, from common scripting languages to domain-specific languages (DSLs) provided by the tools themselves.
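To make the declarative idea a little more concrete, here is a minimal sketch in Python. It is not how any particular tool works: the VirtualMachine class, the plan function, and the server names are all made up for illustration. The point is only that you describe the state you want, and a tool works out which actions would get you there.

```python
from dataclasses import dataclass


# Hypothetical resource model, for illustration only.
# Real IaC tools define their own schemas and languages.
@dataclass(frozen=True)
class VirtualMachine:
    name: str
    image: str
    cpu_cores: int
    memory_gb: int


def plan(desired, actual):
    """Compare desired and actual state (dicts keyed by machine name)
    and return the list of actions needed to reconcile them."""
    actions = []
    for name, vm in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != vm:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# The description of what we want...
desired = {
    "web-1": VirtualMachine("web-1", "ubuntu-22.04", 2, 4),
    "web-2": VirtualMachine("web-2", "ubuntu-22.04", 2, 4),
}

# ...and what is (hypothetically) running right now.
actual = {
    "web-1": VirtualMachine("web-1", "ubuntu-22.04", 2, 4),
    "web-2": VirtualMachine("web-2", "ubuntu-20.04", 2, 4),
    "web-3": VirtualMachine("web-3", "ubuntu-22.04", 2, 4),
}

for action, name in plan(desired, actual):
    print(action, name)
# prints:
# update web-2
# delete web-3
```

Notice that the description never says how to update or delete anything. That is left to the tool, which is what makes the description declarative.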

IaC has evolved alongside the emergence of Infrastructure as a Service (IaaS) and other cloud-based services. The programmability of IaaS and the declarative nature of IaC work very well together. Instead of setting up that cloud environment by hand each time, you can just automate it with IaC.

But that doesn't mean that IaC limits you to IaaS, whether public, private, or hybrid. With a little extra work, you can use infrastructure configuration tools to manage a traditional collection of physical servers.

Alternatives

Before I continue, it would be remiss of me not to mention the other options.

There are three main ones that I can think of:

Manually setting up your infrastructure using the visual console provided by your cloud provider. For example, using the Azure Portal by hand to set up your Microsoft Azure products, one by one. Clicking around the interface, creating a new VM, choosing the operating system from a drop-down menu, launching it, monitoring the status, and so on.
Using the CLI tool provided by a cloud provider.
Managing your own physical or virtual machines instead of using a cloud provider, with your own collection of configuration tools, management tools, deployment scripts, and so on.

If you're using a cloud provider, the console is a great way to learn the ropes. But clicking through it quickly grows tiresome and error-prone if you try to manage your whole setup that way. In addition, there's usually no built-in change visibility. You have to remember what actions you took, and document them for the rest of the team.

And if you're managing your own servers by hand, developing your own system administration scripts can work okay. But it's easy for such a system to become a bit of a non-standard hodgepodge.

And, well, things can quickly get out of hand...

The Landscape

Okay, let's take a high-level look at the IaC landscape.

There are basically two categories of tools:

Orchestration tools are used to provision, organize, and manage infrastructure components. Examples include Terraform, AWS CloudFormation, and Azure Resource Manager.
Configuration management tools are used to install, update, and manage the software running on the infrastructure components. Examples include SaltStack, Puppet, Chef, and Ansible.

Additionally, when it comes to IaC, there are two primary approaches to managing infrastructure:

Some tools (e.g., SaltStack) treat infrastructure components as mutable objects. That is, every time you make a change to the configuration, the necessary set of changes is made to the already-running components.
Other tools (e.g., Terraform) treat infrastructure components as immutable objects. That is, when you make a change, new components with the new configuration are created to replace the old components.
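The difference between the two approaches can be sketched in a few lines of Python. This is a toy model, not the behavior of SaltStack or Terraform: the Server class and both apply functions are hypothetical names I've invented for the example.

```python
import itertools

# Hypothetical ID generator so each server object has a distinct identity.
_ids = itertools.count(1)


class Server:
    """A toy stand-in for a running infrastructure component."""

    def __init__(self, config):
        self.id = next(_ids)
        self.config = dict(config)


def apply_mutable(server, new_config):
    """Mutable approach: change the already-running server in place."""
    server.config.update(new_config)
    return server


def apply_immutable(server, new_config):
    """Immutable approach: build a replacement server with the new
    configuration. The old server is left untouched and can be retired."""
    return Server({**server.config, **new_config})


old = Server({"python": "3.10"})

same = apply_mutable(old, {"python": "3.11"})
print(same is old)  # True: the same server, now with a changed config

replacement = apply_immutable(old, {"python": "3.12"})
print(replacement is old)  # False: a brand-new server took its place
```

With the mutable approach, a server carries the history of every change ever applied to it. With the immutable approach, every server starts from the same description, which makes the kind of drift described at the start of this post less likely.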

What Are the Benefits?

Developer Mindset

Developers ideally concern themselves with creating stable and sustainable software. But when infrastructure is "something those other people take care of," it's easy for developers not to consider how the software they are building is going to be run.

When you manage the infrastructure using code and involve the application developers in that process, it can prompt a change in mindset.

How is this application going to be deployed? How is it going to be maintained once it's running? How are upgrades done? Have you thought about all the ways it might fail on a production machine? What preventative measures can you take?

All of this and more becomes a natural part of the application development process itself when it is integrated with IaC.

Version Control

Because you are defining your infrastructure with code, it should also be versioned in a repository like the rest of your code.

All of the benefits that version control offers application development become available for infrastructure management:

A single source of truth.
The code itself, and the way it is laid out, is a communication tool and can be used to understand the way things are built.
The history of changes is recorded and accessible so you can understand what has changed and why over time.
Coding standards can be developed and maintained.
Pull requests and code reviews increase knowledge sharing and improve the overall quality of the code.
Experimental changes can be made on branches and tested in isolation.
Making changes is less daunting because if something goes wrong, you can revert to a known working version.

And so on, and so on...

Knowledge Sharing

It's a widespread phenomenon that one or two people in an organization will accumulate critical information that only they seem to possess.

[At one of my previous companies, we kept a list of such individuals called the "red bus" list, so named because of the risk that one of them might be hit by one. – Ed.]

Less dramatically, such people take holidays and sometimes get sick, like anyone else. So you should be able to cope with them not being around.

With your infrastructure documented using code, it is hopefully relatively straightforward to understand. And what's more, this is a type of living documentation that should always be up-to-date.

Better Use of Time

In general, humans are bad at tedious, repetitive tasks. We lose interest, and that's when we're most likely to make mistakes.

Fortunately, this is what machines are good at.

When you manage your infrastructure with code, you offload all of the tedious, repetitive work to computers. And as a result, you reduce the chance of inconsistencies, mistakes, incomplete work, and other forms of human error.

This also allows your people to spend their time and energy on more important, high-value tasks.

Continuous Integration

Continuous Integration (CI) is the practice of team members merging their code changes into a mainline branch multiple times per day.

The benefit is that you are continually validating your code against your test suites. You can also adapt to changes in the codebase in a manageable, incremental fashion, as and when they happen, rather than all in one go at the end of your project (what has been termed "integration hell").

IaC improves CI because changes to the infrastructure can be tested like changes to the application code. Additionally, a temporary testing environment can be provisioned for every change to application code.
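As a rough sketch of what testing infrastructure like application code might look like, here is a small Python check that could run in CI. The SERVERS structure, the package names, and the find_version_mismatches function are all hypothetical. The check only compares versions of packages that appear on more than one server, so it is far from a complete validation.

```python
# Hypothetical infrastructure description: each server lists the
# package versions it is supposed to run.
SERVERS = [
    {"name": "prod-1", "packages": {"libfoo": "2.1", "libbar": "0.9"}},
    {"name": "prod-2", "packages": {"libfoo": "2.1", "libbar": "0.9"}},
    {"name": "prod-3", "packages": {"libfoo": "2.1", "libbar": "0.9"}},
]


def find_version_mismatches(servers):
    """Return {package: {versions}} for every package that is pinned
    to more than one version across the given servers."""
    seen = {}
    for server in servers:
        for package, version in server["packages"].items():
            seen.setdefault(package, set()).add(version)
    return {pkg: versions for pkg, versions in seen.items() if len(versions) > 1}


def test_versions_are_consistent():
    """A test that a runner such as pytest could collect and run
    on every change to the infrastructure description."""
    assert find_version_mismatches(SERVERS) == {}


test_versions_are_consistent()
```

Had a check like this been in place in the story at the start of this post, the mismatch on the third machine would have surfaced as a failing test, not as a failed deployment.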

Continuous Delivery

Continuous Delivery (CD) is the practice of making regular, automated releases, typically every time code is pushed to the mainline branch.

CD and IaC are a match made in heaven.

With IaC, you can set up a deployment pipeline that automates the process of moving different versions of your application from one environment to the next as your testing and release schedules dictate. But more on that topic in another post. :)
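Until then, here is a very small Python sketch of the idea of promoting a version from one environment to the next. The environment names, the promote function, and the check callback are assumptions I've made for the example. A real pipeline would call out to your CI system and your IaC tooling.

```python
# Hypothetical environment order for the pipeline.
ENVIRONMENTS = ["dev", "staging", "production"]


def promote(version, deployed, check):
    """Move `version` through the environments in order.

    `deployed` maps environment name to the version it currently runs.
    `check(env, version)` returns True if the version is healthy in env.
    Stops at the first environment whose check fails and restores the
    version that environment ran before. Returns the names of the
    environments that now run `version`.
    """
    promoted = []
    for env in ENVIRONMENTS:
        previous = deployed.get(env)
        deployed[env] = version
        if not check(env, version):
            deployed[env] = previous  # roll back this environment only
            break
        promoted.append(env)
    return promoted


deployed = {"dev": "1.0", "staging": "1.0", "production": "1.0"}

# Pretend the new version fails its checks in production.
result = promote("1.1", deployed, lambda env, version: env != "production")
print(result)    # ['dev', 'staging']
print(deployed)  # {'dev': '1.1', 'staging': '1.1', 'production': '1.0'}
```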

Wrap Up

Infrastructure as Code (IaC) bridges the gap between system administrators and developers, and in doing so it:

Helps developers think about the entire lifecycle of the software they are developing.
Brings version control, code reviews, knowledge sharing, and other benefits to systems administration.
Enables you to radically transform your CI and CD pipeline.
Reduces the need for repetitive and error-prone tasks, freeing people up to work on more exciting things.

The end result should be a more consistent, reliable, and well-understood infrastructure that is easier to develop and easier to maintain.

In part two of this miniseries, I dive into the details and show you some of the ways we're using Terraform at Crate.io for provisioning and managing infrastructure.

Read part two of the series
