How We Ship Serverless
Over the past month I have been building a new serverless application delivery process for Addresscloud. Whilst I initially viewed this work as reducing technical debt and enabling faster development, I have now come to view it as core to our success as a SaaS business. This is because we have taken a serverless approach to managing scale - a design pattern which is at the core of how we operate as a market leader for addressing and geographic intelligence. In serverless, infrastructure changes are synonymous with code commits, the complexity and frequency of which prompted us to move from a semi-manual deployment process to a fully automated one. Using GitHub's new Actions workflows we were able to integrate Terraform Cloud's APIs to create a robust continuous deployment workflow to ship our serverless applications.
Requirements
In serverless architectures software is tightly coupled to cloud infrastructure - code changes are infrastructure updates and infrastructure updates are code changes.
With a growing number of serverless services powering Addresscloud, we recognised early on the need for an infrastructure-as-code (IaC) approach to create and manage services. In March I was grateful to accept the opportunity to present to Simply Business developers about our serverless design patterns, and how we use Terraform to implement IaC at Addresscloud. In the following question and answer session we had a great discussion about the challenges of manually provisioning state-locking and change management for Terraform workspaces. Since that presentation HashiCorp have released Terraform Cloud, a SaaS for managing and running infrastructure created using Terraform. As Terraform users, migrating our Terraform state storage to Terraform Cloud was a natural progression for our serverless stack.
Terraform Cloud (TFC) provides user-level approval and logging of infrastructure changes, which is really useful for rolling back updates and auditing system state. Changes to infrastructure are made within a container environment, managed by TFC, which uses the standard Terraform plan and apply process. Environment variables for infrastructure builds can be set using the TFC interface, which means that, combined with Terragrunt-style organisation of Terraform modules, infrastructure changes for different deployments (e.g. dev and prod) are performed using isolated and consistent build environments.
The advantage of this approach is that infrastructure change management can be handed off to TFC, removing the need to manage deployments locally or to manually install Terraform in our continuous integration environment. This reduces workflow complexity and decreases the risk of human error during deployment.
Terraform Cloud can integrate directly with version control systems (VCS; e.g. GitHub). In this mode TFC triggers infrastructure updates using repository webhooks (e.g. code commits). Whilst this might work for repositories which manage traditional cloud infrastructure, it fails for serverless architectures where logic and infrastructure code exist as one, and testing needs to be performed before infrastructure can be safely deployed.
Terraform Cloud's VCS integration also restricts local infrastructure updates (i.e. code must be pushed to GitHub), as the repository is acting as a single source of truth for the infrastructure. However, on the day I started to write this post GitHub's notifications service had been degraded for ~6 hours, meaning that I could no longer trigger build events using repository activity webhooks. To manage future GitHub outages we need a failsafe so that we are able to manually push infrastructure code changes from locally executed Terraform plans. In summary, we identified the following requirements for our serverless deployment workflow:
- Run code tests that are bundled with infrastructure code
- Once tests have completed, hand off to Terraform Cloud for infrastructure updates
- Support one-off manual updates via the CLI (failsafe)
A Solution
HashiCorp provide an API for Terraform Cloud, which can be used to execute infrastructure updates based on pushed code. The official documentation provides a rudimentary set of Bash scripts using cURL to make API requests from the command line. Around the same time as we started to explore TFC, GitHub added us to the beta trial of their new Actions tooling. GitHub Actions enables scripts and custom workflows to be run in GitHub containers in response to code changes.
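To make the API hand-off concrete, here is a minimal Python sketch of the sequence of requests involved in pushing a packaged configuration to Terraform Cloud: look up the workspace, create a configuration version, upload the archive, then create a run. It is an illustration under stated assumptions, not the code of our Action or of HashiCorp's scripts. The function names (`config_version_payload`, `run_payload`, `api_request`, `push_run`) are hypothetical, and a production client would probably also need to wait for the configuration version to finish processing before creating the run.

```python
import json
import urllib.request

# Public Terraform Cloud API root (assumed; a private install would differ).
TFC_API = "https://app.terraform.io/api/v2"


def config_version_payload(auto_queue_runs=False):
    """JSON:API body for creating a configuration version."""
    return {
        "data": {
            "type": "configuration-versions",
            # False here because this sketch creates the run explicitly below.
            "attributes": {"auto-queue-runs": auto_queue_runs},
        }
    }


def run_payload(workspace_id, config_version_id, message):
    """JSON:API body for creating a run against an uploaded configuration."""
    return {
        "data": {
            "type": "runs",
            "attributes": {"message": message},
            "relationships": {
                "workspace": {
                    "data": {"type": "workspaces", "id": workspace_id}
                },
                "configuration-version": {
                    "data": {
                        "type": "configuration-versions",
                        "id": config_version_id,
                    }
                },
            },
        }
    }


def api_request(method, url, token, body=None):
    """Send one authenticated JSON:API request and return the decoded reply."""
    data = json.dumps(body).encode() if body is not None else None
    request = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/vnd.api+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def push_run(token, org, workspace, archive_path, message):
    """Hypothetical helper: upload a tar.gz archive and start a run for it."""
    ws = api_request(
        "GET", f"{TFC_API}/organizations/{org}/workspaces/{workspace}", token
    )
    ws_id = ws["data"]["id"]

    cv = api_request(
        "POST",
        f"{TFC_API}/workspaces/{ws_id}/configuration-versions",
        token,
        config_version_payload(),
    )
    upload_url = cv["data"]["attributes"]["upload-url"]

    # The upload URL is pre-signed, so no Authorization header is sent here.
    with open(archive_path, "rb") as archive:
        upload = urllib.request.Request(
            upload_url,
            data=archive.read(),
            method="PUT",
            headers={"Content-Type": "application/octet-stream"},
        )
    urllib.request.urlopen(upload).close()

    # Assumption: the configuration version is ready by now; real code
    # would likely poll its status before creating the run.
    run = api_request(
        "POST", f"{TFC_API}/runs", token, run_payload(ws_id, cv["data"]["id"], message)
    )
    return run["data"]["id"]
```

The payload builders are kept separate from the network calls so they can be checked without a Terraform Cloud token; `push_run` is only invoked when a real token, organisation, and workspace are supplied.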
What if we could run our logic tests in GitHub, and when successful hand-off our code to Terraform Cloud to manage the required infrastructure updates?
GitHub enables user-defined Actions to be created and published to perform custom operations. HashiCorp's previous GitHub Action was created for the early version of GitHub Actions, and is now deprecated. As a result, we created a custom action to push Terraform jobs from within a GitHub workflow. The Addresscloud Terraform Cloud Action hands off infrastructure changes to TFC once a user's tests have completed. Crucially, because the Action has access to the Git repository, build artefacts can be passed to Terraform so that TFC can ship Lambda function code changes as infrastructure updates. Once the Action has successfully handed off to TFC, the GitHub workflow completes. TFC then executes a plan and apply workflow, with optional user confirmation before deployment. TFC comes with Slack notifications out of the box to keep the team notified of workflow progress.
Using this Action it is fairly simple for a developer to set up a GitHub workflow to run tests, build, and ship to TFC (see listing below). We've been testing the Action over the past month and are now in the process of rolling it out across all our serverless applications, aiming for a consistent build, test, deploy workflow for Addresscloud in 2020.
name: Test & Deploy Dev
on:
  push:
    branches:
      - dev
jobs:
  deploy-lambda:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v1
      - name: Set up node
        uses: actions/setup-node@v1
        with:
          node-version: 10.x
      - name: NPM install
        run: npm ci
      - name: Test
        run: npm run test
      - name: Build
        run: npm run build
      - name: Create tar gz file for TFC
        run: tar --exclude *.terraform* -zcvf build.tar.gz build infrastructure
      - name: Send run to Terraform Cloud
        uses: addresscloud/terraform-cloud-action@v1.0.0
        with:
          tfToken: ${{ secrets.TERRAFORM_TOKEN }}
          tfOrg: '[Organisation]'
          tfWorkspace: 'hello-world-terraform'
          filePath: './build.tar.gz'
          identifier: ${{ github.sha }}
Example GitHub Action workflow, sending a Lambda package to TFC for deployment once tests have passed.
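The "Create tar gz file for TFC" step is also what a developer would need to reproduce by hand for a one-off manual push. As a rough sketch, the same archive could be built in Python with the standard `tarfile` module. The helper names `_skip_terraform` and `make_bundle` are hypothetical, and the folder names `build` and `infrastructure` are taken from the workflow above; the exclusion pattern mirrors the `*.terraform*` glob passed to `tar`.

```python
import fnmatch
import tarfile
from pathlib import Path


def _skip_terraform(member):
    """Return None (skip) for any member path matching '*.terraform*'.

    Skipping a directory this way also skips everything beneath it,
    so a local '.terraform' plugin cache never reaches the archive.
    """
    if fnmatch.fnmatch(member.name, "*.terraform*"):
        return None
    return member


def make_bundle(root, output="build.tar.gz", folders=("build", "infrastructure")):
    """Hypothetical helper: gzip the given folders under `root` into one archive."""
    root = Path(root)
    archive_path = root / output
    with tarfile.open(archive_path, "w:gz") as archive:
        for folder in folders:
            # arcname keeps paths relative, e.g. 'infrastructure/main.tf'.
            archive.add(root / folder, arcname=folder, filter=_skip_terraform)
    return archive_path
```

Calling `make_bundle(".")` from the repository root would produce a `build.tar.gz` containing the built code and the Terraform configuration, without the local `.terraform` working directory.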
Conclusions
Combining GitHub Actions and Terraform Cloud made it possible to build a continuous testing and delivery workflow to ship our serverless apps. This approach lets the GitHub Action and TFC manage the complexity of the build-deployment process, simplifying the developer workflow. The end result is that we can ship code improvements faster, delivering new features to our customers whilst reducing the risk of introducing build errors or deploying unstable code. An interesting future challenge is how we can trigger post-deployment integration tests, and whether we can reverse the pipeline to automatically roll back unstable deployments.
Addresscloud's Terraform Cloud Action is published under an open source license on the GitHub Marketplace.