Using Terraform workspaces for fun and profit – Part 2

In the first part of this series, we built a Terraform setup that uses a single set of files to maintain 3 separate environments. The next step is to automate as much of that process as possible.

To do that, we’re going to leverage Azure DevOps to create a build pipeline for each environment, which will kick off when we merge code into the respective environment’s git branch.

Step 1: Create a git repo to store the .tf files

First you’ll need to sign into Azure DevOps. Once signed in, you’ll want to either select an existing organization, or create a new one, and then select an existing project, or create a new one.

Once you've got a project open, click the + symbol at the top of the left-hand panel and select 'New repository'. Create a repository using your company's naming convention (or, if you don't have a particular convention, I prefer Terraform.DevOps.IT-CompanyName).

Finally, if you have existing .tf files, create a branch called dev (git checkout -b dev), then commit and push them up to the repo.
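
Here's a rough sketch of that initial push, assuming your .tf files live in a brand-new local folder; the repository URL is a placeholder, so use the clone URL Azure DevOps shows for your new repo:

$ git init
$ git checkout -b dev
$ git add .
$ git commit -m "Initial Terraform import"
$ git remote add origin https://dev.azure.com/<organization>/<project>/_git/Terraform.DevOps.IT-CompanyName
$ git push -u origin dev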

Step 2: Use Azure Blob Storage as the backend provider to store your state files

The alternative is keeping the state file in a single location (like your laptop), which may die, be stolen, or suffer other bad things, so we want to tell Terraform to keep its state files in the cloud for easy, shared access.

Open up the Azure portal and browse to the Storage Accounts section. Create a new Storage Account using the default options. Unless your Terraform environment is absolutely massive, this account will use next to no storage (state files are typically measured in kilobytes, not gigabytes), certainly less than the threshold to be charged $0.01 per month. Once the Storage Account is created, you'll want to add the azurerm backend configuration to your Terraform files as described in Terraform's backend documentation, and then run terraform init, which will configure the backend and migrate your state file into the Storage Account.
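
Here's a minimal sketch of that backend block, assuming you've also created a blob container named tfstate inside the Storage Account (the azurerm backend stores the state file as a blob within a container), and using placeholder names for the account and its resource group:

terraform {
  backend "azurerm" {
    resource_group_name  = "rg-terraform-state"  # placeholder: resource group holding the Storage Account
    storage_account_name = "companytfstate"      # placeholder: the Storage Account created above
    container_name       = "tfstate"             # blob container that holds the state
    key                  = "terraform.tfstate"   # name of the state blob
  }
}

Each Terraform workspace stores its state under its own workspace-specific variation of this key, so all three environments can safely share the one backend.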

Step 3: Begin creating the Build pipeline

Now that we have our state stored in Azure Blob Storage, we can begin working on the actual pipeline. This is where the automation actually happens and you’ll see how much easier it makes things.

  1. Go back to Azure DevOps, under the Pipelines section on the left-hand panel, select ‘Builds’, and then click the ‘New pipeline’ button.
  2. When the window opens, choose the Classic editor link near the bottom.
  3. Select the correct project, repository, and branch (dev)
  4. Choose to start with an Empty job

Now that you’ve got a skeleton of a build, here’s where we actually start adding the steps to create a fully functional Terraform setup.

Next to 'Agent job 1', click the + sign. In the search field, enter Terraform, select the entry called 'Terraform Build & Release Tasks' by Charles Zipp, and then select 'Get it free'. Follow the prompts to install it into your DevOps organization (it doesn't cost anything). Once it's installed, refresh the build page, click the + sign, and search for Terraform again.

For each pipeline we build, we'll first need to install the Terraform tools, so add a 'Terraform Installer' task. Select the new task and make sure the version of Terraform being installed is exactly the same as the version you're using on your local machine; edit the 'Version' field and replace it with the correct value.
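
If you're not sure which version you're running locally, a quick check on your own machine will tell you:

$ terraform version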

Add another task: search for Terraform again, but this time select 'Terraform CLI'. It will probably default to 'terraform validate', so select the task so we can update a few things. Since the first thing we need to do is initialize the environment by downloading all of the necessary providers, change the Display Name to something more descriptive like terraform init and change the Command dropdown to 'init'. Leave everything else the same.

Add another Terraform CLI task, this time for the validation step. This makes sure that, before we try to do anything with our .tf files, they're valid and the syntax is correct. Since validate is the default option for the Terraform CLI task, we're done with this one.

Add another task, but this time, since the Terraform add-on doesn't support the workspace command (as of this writing), we need a shellexec task, so search for that and add it. Change both the Display Name and the Code field to read terraform workspace select dev.

You guessed it: add another Terraform CLI task, this time for the plan. Update the Display Name to read terraform plan and select plan from the Command dropdown. Here's where we get to define the environment, so in the 'Command Options' text field, enter -var-file=environments/dev.tfvars -out=TFPlan -input=false. This tells Terraform to use the variables you've specified for the DEV environment, write the plan to a file called TFPlan, and not expect any interactive input.

Almost there.

Add another task, this time searching for copy and selecting Microsoft's Copy Files task. For the Source Folder, enter $(Build.SourcesDirectory), and for the Target Folder, enter $(Build.ArtifactStagingDirectory).

For the final task, search for publish and select Microsoft’s Publish Build Artifacts. The defaults here are fine.
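
Taken together, the Terraform tasks above amount to the same CLI sequence you'd run locally against the dev environment, roughly the following; the Copy Files and Publish Build Artifacts tasks then stash the resulting plan and .tf files as a build artifact for the Release pipeline we'll build in the next part:

$ terraform init
$ terraform validate
$ terraform workspace select dev
$ terraform plan -var-file=environments/dev.tfvars -out=TFPlan -input=false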

All you have to do to finish up is click the dropdown arrow next to ‘Save and queue’ and select ‘Save’, accept the defaults, and click ‘Save’ again.

You now have a working DevOps Build Pipeline!

In the next part, we’ll take the artifacts published from this Build pipeline and use them to create a Release pipeline that will actually tell Terraform to make your changes.

The XY Problem and how to avoid it

What is the XY Problem?

The XY Problem is asking about your attempted solution rather than your actual problem. This leads to enormous amounts of wasted time and energy, both on the part of people asking for help, and on the part of those providing help.

http://xyproblem.info/

Or to put it another way: the asker has already settled on an approach (Y) and asks for help making that approach work, instead of describing the goal (X) they are actually trying to reach.

I'm going to exaggerate the example a little here, because I want to make it crystal clear what's going on.

Let’s say you work at a carving factory, and you’re in charge of all of the tools that the carvers need to do their job. When a tool breaks, a carver comes to you for a replacement. And let’s also say that each carver is tasked with taking a 10′ x 10′ x 10′ block of stone and reducing it down to a 1.5′ x 1.5′ x 1.5′ smooth cube.

A carver comes up to you and says “I need a new chisel, mine broke”. So you happily provide her a new chisel, satisfied that you’ve done your job and that she’s doing hers. Except, the next day, the same carver comes back and says “I need a new chisel, mine broke” again, so with a bit of a puzzled look on your face you go get her another chisel. The next day, same carver, same request again. And the day after. And the day after that.

You also notice that some carvers have completed two full 10′-cubed-to-1.5′-cube stones this week, while the one who has burned through 5 chisels has barely made any progress. The other carvers have been using diamond-tipped saws and jack hammers and other heavy duty tools in addition to the chisels, depending on how much material they were trying to remove at a particular time.

This is the XY Problem. The carver spent so much time focused on what she thought was the solution ("just keep getting a new chisel every time mine breaks") that she never asked what the actual problem was ("how do I remove a lot of material in the most efficient way possible?"), and in doing so she wasted her time and the company's resources.

In a support-based industry like IT, this can be a depressingly common thing to deal with. Many times, the people you are tasked with supporting learned to do a particular task one way, and so just kept on doing that same task over and over without ever questioning why it was done that way, or if it was ever even necessary in the first place.

It’s completely understandable too – change is hard, and maybe they even got really efficient at doing that task that way, but that doesn’t mean there isn’t a better way to do it, or that it’s even necessary to do at all.

One of the most valuable skills I've learned is to pick up on when someone is presenting me with an XY Problem: when they've started describing how they want me to solve a problem, rather than telling me the end goal they're trying to accomplish and what they've tried along the way. I don't call the person out, because that does nothing but put the requestor in a defensive posture (completely unhelpful) and put me in a frustrated here-just-let-me-do-it state. Instead, I try to get at the desired end state by probing, and then re-frame their thinking in terms of what I would do if I were in their shoes. By doing so, you align yourself with the other person, making them much more likely to take your advice.

Using Terraform workspaces for fun and profit – Part 1

We are a fairly small company (~350 employees) with a very small cloud team (myself and one other guy), so making use of automation where it's cheap or free is imperative if we don't want to get overwhelmed by the amount of work being thrown our way. One major challenge we faced was that, for compliance reasons, we needed separate environments for development, QA, and production, while at the same time minimizing the time it takes to promote successful projects from development to QA, and eventually from QA to prod.

This is the path we took.

First, we created a Terraform workspace for each environment:

$ terraform workspace new dev
$ terraform workspace new qa
$ terraform workspace new prod
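
To confirm the workspaces were created (the currently selected one is marked with an asterisk), you can list them:

$ terraform workspace list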

Next, we created Azure subscriptions for each environment in PowerShell:

Install-Module Az.Subscription -AllowPrerelease
Get-AzEnrollmentAccount
New-AzSubscription -OfferType MS-AZR-0148P -Name "IT.TechServices.DEV" -EnrollmentAccountObjectId <enrollmentAccountObjectId>
New-AzSubscription -OfferType MS-AZR-0148P -Name "IT.TechServices.QA" -EnrollmentAccountObjectId <enrollmentAccountObjectId>
New-AzSubscription -OfferType MS-AZR-0017P -Name "IT.TechServices.PRD" -EnrollmentAccountObjectId <enrollmentAccountObjectId>

Note: The OfferType property is different between the DEV/QA subscriptions and the PRD subscription. This is because as part of our Enterprise Agreement with Microsoft, we have access to separate Dev/Test subscription pricing on the condition that we don’t run any production workloads in it. I am not sure if these values are universal, so if they don’t work for you, please check with your MS rep for the correct ones.
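
You'll need each subscription's ID for the variable files in the next step. Assuming you're already connected with Connect-AzAccount, listing your subscriptions will show them:

# The Id property is what goes into subscription_id in each environment's .tfvars file below
Get-AzSubscription | Select-Object Name, Id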

Then we created a sub-folder called environments, and in it we created a Terraform variable file for each environment (dev.tfvars, qa.tfvars, and prod.tfvars), each containing the appropriate Azure subscription ID in a variable called subscription_id and the name of the environment in a variable called environment_name. We then created a main.tf file declaring those variables with variable subscription_id {} and variable environment_name {} (a sketch of main.tf follows the folder layout below). So, for example, dev.tfvars would contain subscription_id = "109b6c11-e163-477e-8453-7613249447c" and environment_name = "dev", while qa.tfvars would contain subscription_id = "95958666-8ab6-3980-828a-23f7382b9c5a" and environment_name = "qa".

/terraform
 |
 +--+ /environments
 |      |
 |      +--+ dev.tfvars
 |      +--+ qa.tfvars
 |      +--+ prod.tfvars
 |
 +--+ main.tf
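
For reference, here's a sketch of what main.tf might contain. The variable declarations are the ones mentioned above; the provider block is my own addition to show where subscription_id actually gets used (and depending on your azurerm provider version, you may also need an empty features {} block):

variable "subscription_id" {}
variable "environment_name" {}

# Illustrative provider block: consumes the per-environment subscription ID
# supplied by the .tfvars file passed to plan/apply
provider "azurerm" {
  subscription_id = "${var.subscription_id}"
}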

OK, let's say we wanted to create some service called example-service, which requires a resource group to start placing components in. If we wanted to do that in the Azure Canada Central region, with some descriptive tags for sorting and billing, we would do the following:

resource "azurerm_resource_group" "example-service" {
  name     = "example-service-${var.environment_name}"
  location = "canadacentral"

  tags = {
    environment = "${var.environment_name}"
    owner       = "justin.smith"
    product     = "example-product"
    department  = "tech.services"
  }
}

We're using the ${var.environment_name} placeholder in the resource name, which means we only have to maintain a single set of .tf files; each deployed resource gets named according to the environment we're working in.

Finally, we created a git repo for our entire Terraform collection and created 3 branches within it: dev, qa, and prod, with each one having successively more restrictions on committing to it than the previous. We've also set up Azure DevOps build and release pipelines that are triggered each time code is committed to the respective branch.

For example, anyone on our team can deploy changes to the dev branch because we don't really care what happens in it. The qa branch requires at least one of either myself or my co-worker to approve the commit; we want to make sure people aren't just adding unnecessary resources to QA, but it's still not "live", so restrictions are somewhat relaxed. The prod branch requires the change ticket number from our ticketing system to be present in the Pull Request before the PR can be approved, ensuring the change has gone through the appropriate Change Approval Process before it becomes part of our daily operational management routine.

Now we’re ready to deploy! (Please note that the steps below simply replicate our pipelines in a manual way and will work for this demo. It is considered best practice to automate these steps once you’re comfortable with the process.)

First, we ensure we’re in the dev workspace:

$ terraform workspace select dev

Then, to make sure we're not complete idiots, we run a Terraform plan, passing in the appropriate environment's variable file:

$ terraform plan -var-file=environments/dev.tfvars

And assuming nothing is screaming, we apply it:

$ terraform apply -var-file=environments/dev.tfvars

We now have a resource group named example-service-dev in our Azure DEV subscription, ready for us to deploy resources into!

Now let's say we've performed the various development tasks, creating the other resources via Terraform and applying them using the same method as above (with the ${var.environment_name} placeholder at the end of each resource name), and we're happy with how things look in the development environment. All we have to do is switch to the qa workspace and plan/apply again, this time using the QA variable file:

$ terraform workspace select qa
$ terraform plan -var-file=environments/qa.tfvars
$ terraform apply -var-file=environments/qa.tfvars

Now you’ve got example-service-qa all set up and ready to go.

And finally, once your QA team has validated and tested the service, you run the same steps one more time using the prod environment settings:

$ terraform workspace select prod
$ terraform plan -var-file=environments/prod.tfvars
$ terraform apply -var-file=environments/prod.tfvars

And assuming that between each promotion (dev-to-qa, and qa-to-prod), the Terraform files were committed and promoted correctly, you should now have 3 fully functional environments (example-service-dev, example-service-qa, and example-service-prod) with only one set of Terraform files!

In the next part, we’ll walk through building an Azure DevOps Build pipeline to begin automating the deployment so that we don’t need to be so hands-on every time a Terraform change is made!
