Terraforming Mars by Daein Ballard

Automating Monitoring & Alerting Infrastructure with Terraform

At iLert we embrace infrastructure as code and try to automate our processes wherever possible. This ranges from nifty little bash scripts to full-blown Terraform projects that spin up whole environments with as little as terraform apply on the command line.

With HashiCorp’s Terraform you can use infrastructure as code to provision and manage any cloud, infrastructure, or service. Terraform can be extended to work with third-party services through Terraform Providers hosted in the Terraform Registry.

Let’s see how we can use the Grafana Terraform Provider and the iLert Terraform Provider to set up an automated metrics alert that triggers a phone call during support hours.

Content

  1. Requirements
  2. Grafana & Prometheus setup
  3. iLert setup
  4. Terraform setup
  5. Understanding Terraform
  6. Automating infrastructure
  7. Taking this further

Requirements

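You will need a running Grafana instance, an iLert account, and a local Terraform installation. The following sections cover each of them.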

Grafana setup instructions

Note: in case you already have a running Grafana instance ready to go, you can skip this step.

However, if you do not have one handy and want to explore your options quickly, we have provided a docker-compose setup that you can use to spawn an instance.

  • clone our sample repository git clone git@github.com:iLert/terraform-grafana-alerting-sample.git
  • cd terraform-grafana-alerting-sample
  • run docker-compose up
  • your Grafana instance should be running at http://localhost:3000

iLert setup instructions

Usually you would set up your iLert users and notification settings in iLert directly, or through SSO providers in larger applications. However, for the sake of this Terraform showcase, we will create all resources end to end in Terraform, including the user and their notification settings.

Terraform setup instructions


Let’s install Terraform first. You can download a copy from the Terraform website or install it with a tool like brew, e.g. brew install terraform.

Verify the installation by running terraform -v in your shell. You should see something like this: Terraform v0.13.5.
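The provider source syntax used later in this post requires Terraform 0.13 or newer, so it can help to pin a minimum version in the project itself. A minimal sketch; this constraint is our suggestion and is not necessarily part of the sample repository:

```hcl
terraform {
  # Provider `source` addresses in required_providers need Terraform 0.13+.
  # Adjust the constraint to whatever version your team standardizes on.
  required_version = ">= 0.13"
}
```

With this in place, Terraform stops with a clear error on older installations instead of failing later on unknown syntax.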

Understanding Terraform

First off, grab the source code for this post if you haven’t already: clone our sample repository with git clone git@github.com:iLert/terraform-grafana-alerting-sample.git and navigate into it with cd terraform-grafana-alerting-sample.

You will see the following files and folders:

Providers

providers.tf

This file describes the providers required for our setup and maps their required variables, e.g. the credentials to access Grafana or iLert.

terraform {
  required_providers {
    ilert = {
      source  = "iLert/ilert"
      version = "~> 1.1.3"
    }
  }
}
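To make the "maps their required variables" part more concrete, here is a rough sketch of what provider configuration can look like. The variable names grafana_url and grafana_auth are hypothetical and may differ from the sample repository; url and auth are the Grafana provider's arguments behind the GRAFANA_URL and GRAFANA_AUTH environment variables used later in this post. The sketch assumes both providers are declared in required_providers.

```hcl
# Hypothetical variable names, chosen for illustration only.
variable "grafana_url" {
  description = "Base URL of the Grafana instance"
  type        = string
  default     = "http://127.0.0.1:3000"
}

variable "grafana_auth" {
  description = "Grafana credentials in user:password form, or an API key"
  type        = string
}

# Explicit mapping from Terraform variables to provider arguments.
provider "grafana" {
  url  = var.grafana_url
  auth = var.grafana_auth
}

# An empty block leaves the provider to pick up its settings from the
# environment (ILERT_ORGANIZATION, ILERT_USERNAME, ILERT_PASSWORD in this post).
provider "ilert" {}
```

Both styles can be mixed: explicit arguments are easier to review, while environment variables keep credentials out of the code.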

Resources

grafana.tf
ilert.tf

These files describe the resources of the third-party services that Terraform will manage, e.g. the Grafana alert or the iLert alert source that turns our alert into incidents.

resource "ilert_alert_source" "grafana" {
  name                   = "Grafana Integration"
  integration_type       = "GRAFANA"
  escalation_policy      = ilert_escalation_policy.grafana.id
  incident_priority_rule = "HIGH_DURING_SUPPORT_HOURS"
}
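The escalation_policy line shows how resources reference each other: resource type, resource name, then attribute. Terraform uses such references to work out the order in which resources are created. The same syntax works in outputs. A small sketch, where the output name is made up for illustration:

```hcl
# Hypothetical output; it reads the `id` attribute of the alert source
# using the same reference syntax as the escalation_policy line.
output "grafana_alert_source_id" {
  description = "ID of the iLert alert source that receives Grafana alerts"
  value       = ilert_alert_source.grafana.id
}
```

After an apply, terraform output grafana_alert_source_id prints the value.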

Variables

variables.tf

This file holds all of the variables that are needed to set up our resources.

variable "ilert_user_mobile_number" {
  description = "The iLert user mobile to create"
  type        = string
  default     = "+491234567890"
}
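Variables can also carry validation rules, which catch typos before any API call is made. The following is an extended variant of the declaration above, as a sketch; the validation block is our illustration and is not part of the sample repository:

```hcl
variable "ilert_user_mobile_number" {
  description = "The iLert user mobile to create"
  type        = string
  default     = "+491234567890"

  # Illustrative check: a leading "+" followed by 7 to 15 digits,
  # the first of which must not be 0.
  validation {
    condition     = can(regex("^\\+[1-9][0-9]{6,14}$", var.ilert_user_mobile_number))
    error_message = "The mobile number must be in international format, for example +491234567890."
  }
}
```

Terraform evaluates the condition during plan and apply and stops with the error message if it does not hold.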

Docker-compose files

setup/
docker-compose.yaml

We have provided these in case you have no running Grafana instance handy. They are otherwise not required and are unrelated to Terraform.

Automating infrastructure

Let’s see how we can roll out our infrastructure.

Preparing project

Before we can apply the desired changes to the services, we have to initialize our Terraform project. This prepares Terraform, e.g. by fetching all of our declared providers (Grafana and iLert) and pre-validating the syntax of our provider files.

Simply run terraform init and you should see an output like this:

[Screenshot: terraform init output]

Applying changes to services

For our sample we have additionally configured some handy environment variables, so that you can pass in the dynamic arguments more flexibly (make sure to change them to your needs):

# adjust according to your grafana instance (these are the defaults for our docker-compose)
export GRAFANA_URL="http://127.0.0.1:3000"
export GRAFANA_AUTH="admin:admin"

export ILERT_ORGANIZATION="your-ilert-tenant"
export ILERT_USERNAME="your-ilert-username"
export ILERT_PASSWORD="your-ilert-user-password"

# we provide additional overrides for our variables.tf right from the cli for your convenience, simply adjust to your phone number
terraform apply \
    -var 'ilert_user_email=example@example.com' \
    -var 'ilert_user_username=example' \
    -var 'ilert_user_mobile_code=DE' \
    -var 'ilert_user_mobile_number=+4915231062570'

You should see CLI output like this:

[Screenshot: terraform apply output]
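The -var overrides from the command above can also live in a variable definitions file, which Terraform loads automatically when it is named terraform.tfvars. A sketch using the variable names from the command, with placeholder values:

```hcl
# terraform.tfvars (placeholder values, adjust to your own user)
ilert_user_email         = "example@example.com"
ilert_user_username      = "example"
ilert_user_mobile_code   = "DE"
ilert_user_mobile_number = "+491234567890"
```

With this file next to your .tf files, a plain terraform apply picks up the values. Keep the file out of version control if it holds personal data.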

Trigger test alert

You are now able to trigger a test alert in your Grafana instance. In case you are within your support hours (check ilert.tf for this; the default is Europe/Berlin, Mon-Fri 8am-5pm), you should receive a phone call with your incident information on the provided number.

Reverting all changes

This illustrates the greatness of infrastructure as code, especially during early-stage environment prototyping. With a single command, we can remove all resources and start clean.

Simply run terraform destroy; your CLI output should look like this:

[Screenshot: terraform destroy output]

Taking this further

The iLert Terraform Provider offers more resources, e.g. Connections or Connectors, that can be managed as well.

Additionally, you should always ensure that the Terraform project “state” is stored in an encrypted bucket. Currently the state is stored locally with your project (and excluded from version control by .gitignore). However, the state contains credentials and locks; the latter should be hosted in the cloud to provide shared functionality across teams. Take a look at the official docs on Terraform State for more information.
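As one possible way to do this, the sketch below uses Terraform's S3 backend with server-side encryption and a DynamoDB table for locking. The choice of AWS and all names (bucket, key, region, table) are assumptions for illustration; this post does not prescribe a specific backend.

```hcl
terraform {
  backend "s3" {
    # All values below are hypothetical placeholders.
    bucket = "my-terraform-state"
    key    = "grafana-alerting/terraform.tfstate"
    region = "eu-central-1"

    # Encrypt the state object at rest, since it contains credentials.
    encrypt = true

    # DynamoDB table used for state locking across team members.
    dynamodb_table = "terraform-locks"
  }
}
```

Backend blocks cannot use variables, so the values have to be literals or be passed at init time. After adding a backend, run terraform init again so that Terraform can offer to migrate the existing local state.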