Deploying Packer Images with Dynamic Secrets from HashiCorp Vault

January 11, 2019 Gabe Maentz

Using Packer to automate the build process for machine images is awesome. In fact I have found great success in automating image builds across many environments including VMware, AWS, Azure & GCP, but one thing has always been a bit of a concern for me…..storing the credentials. Of course each environment requires a different set of credentials and to easily access them I typically store them as environment variables or within a local file on my computer. In fact, doing something like this for my AWS deployments has been the norm.

$ export AWS_ACCESS_KEY_ID="awsaccesskey"
$ export AWS_SECRET_ACCESS_KEY="awssecretkey"

A better way

As the number of environments and set of credentials has been growing, I have found a need for storing and accessing them in a better way. Ideally I am looking to store credentials in a central location, look them up when needed for deployment and rotate them when done with the job at hand. I would like the ability to create a set of rotating credentials that can be called and used within my Packer automation workflows.

Enter Vault

HashiCorp, the creators of Packer, also have a secrets management product called Vault. Vault secures, stores, and tightly controls access to tokens, passwords, certificates, API keys, and other secrets in modern computing. Vault supports my idea of rotating credentials or as they call them ‘dynamic secrets’ to minimize the length of time a set of credentials should exist. Vault supports a number of secrets engines to ease integration including AWS, Azure, GCP, Databases and Active Directory. (There are a ton of other engines available, but these will get me started).

In this case, we will utilize Vault’s AWS secrets engine to generate dynamic, on-demand AWS access credentials for my Packer AMI builds.

Enabling The AWS Secrets Engine in Vault

The AWS secrets engine can be enabled via the Vault command line our UI. Below is a quick video capturing the steps using the UI.

Enable the AWS Secrets Engine
Configure the AWS Secrets Engine by providing AWS credentials that provide Vault the ability to dynamically manage IAM Users.
Create a ‘Packer’ role and specify the minimum set of permissions Packer needs to build AMIs. This is done by attaching the appropriate policy to the Packer role, which can be cut/past from the policy outlined in Packer’s AWS AMI builder documentation.
Generate dynamic credentials in AWS that will provide access to Packer to generate an AMI.
Revoke the credentials at any time.

If so inclined, these same series of steps can be performed via the Vault command line.

Build Your AMI Using PACKER With Dynamic AWS Credentials

Now that we have the Vault and AWS integration working we are ready to utilize the dynamic credentials in our Packer build. We will leverage Packer’s user variables to pass in our dynamic secrets by means of a variable file called awskeys.json.

Generate a set of dynamic credentials for Packer within Vault.

Save credentials to a variable file called awskeys.json. Vault provides these in JSON format which is what Packer expects.

  "accessKey": "AKIAIMQUVKMCSRB5NEZA",

  "secretKey": "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",

  "leaseId": "aws/creds/packer/1pf3nMh8KEEJtQQGADFtGYAE"

Specify the correct variable names in the packer build template. The builder stanza of my packer template for creating a vBrisket AMI looks like:

{
  "builders": [{
    "access_key": "{{ user `accessKey`}}",
    "secret_key": "{{ user `secretKey`}}",
    "type": "amazon-ebs",
    "region": "us-east-1",
    "source_ami": "ami-0f9cf087c1f27d9b1",
    "instance_type": "t2.medium",
    "ssh_username": "ubuntu",
    "ami_name": "vbrisket-image {{timestamp}}",
    "ami_description": "vBrisket Image",
    "ami_groups": ["all"]
  }]

Run a Packer build specifying the variable and template file:

packer build -var-file=awskeys.json vbrisket.json

Vault will automatically expire or rotate these credentials based on their lease time so there is no harm for me to show you my credentials as they will be revoked and become utterly useless after they have served their purpose. This workflow can be further automated utilizing consul-template and/or envconsul but that is a subject for a different post. Happy building (securely).

Building the Fleet in Azure with Terraform

October 9, 2018 Gabe Maentz

In the series of Terraform posts we have shown how to effectively utilize Infrastructure as Code to build, deploy, scale, monitor and destroy a fleet of infrastructure across multiple regions in AWS. The beauty of Terraform is that while we may have used it to build out infrastructure in AWS, we can also extend it’s use to other cloud providers as well. As I see more and more organizations adopting a multi-cloud strategy, let’s take a look at what it would take to deploy our fleet into Azure.

Azure Specifics

If you are familiar with AWS, Azure provides many similar services and features. The Azure Terraform provider is used to interact with many of the Azure resources supported by Azure Resource Manager (AzureRM). A brief overview of the Azure resources will will utilize to move our fleet to Azure are:

Azure Authentication: Terraform supports authenticating to Azure through a Service Principal or the Azure CLI. A Service Principal is an application within Azure Active Directory whose authentication tokens can be used as the client_id, client_secret, and tenant_id fields needed by Terraform. Full details to create a service principle are well documented on the Terraform website.

Resource Group: Azure holds related resources for a given solution in a logical container called a Resource Group. You cannot deploy resources into Azure without assigning them to a Resource Group which we will create and manage via the Terraform Azure provider.

Virtual Network: Akin to a AWS VPC, Azure’s Virtual Network provides an isolated, private environment in the cloud. It is here where we will define our IP address range, subnets route tables and network gateways. This build will utilize the Azure network module maintained in the Terraform module registry.

Scalability: In order to scale our fleet to the appropriate size, Azure provides Azure’s Virtual Machine Scaling Set (VMSS). AWSS is similar to AWS Auto Scaling allowing us to create and manage a group of identical, load balanced, and autoscaling VMs. The fleet will be front ended by a load balancer so that we can grow/shrink without disruption and will utilize VMSS module up on my GitHub terraform_azure repository.

Deploy to Azure

For our initial Azure deployment will will create a new set of Terraform files, including a new main.tf to tie together the details of the Azure provider, modules and specifics for how we want to the fleet built. Inside the file we have declared our connection to Azure, the resource group to build, the virtual network details as well as the web server cluster. The VMSS module referenced also builds a jump server/bastion server in the event that you need to connect to the environment to do some troubleshooting. I have specified my Azure credentials as environment variables so that they are not included in this file.

	# Connect to Azure
	# Authenticate with Azure and Create a Resource Group
	# Set through CLI or env variables -
	# How To: https://www.terraform.io/docs/providers/azurerm/authenticating_via_service_principal.html

	# Configure the Azure Provider
	provider "azurerm" {}

	# Create a resource group
	resource "azurerm_resource_group" "network" {
	name = "devtest"
	location = "eastus2"
	}

	# Build Virtual Network
	module "network" {
	source = "Azure/network/azurerm"
	resource_group_name = "${azurerm_resource_group.network.name}"
	location = "${azurerm_resource_group.network.location}"
	address_space = "10.0.0.0/16"
	subnet_prefixes = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
	subnet_names = ["subnet1", "subnet2", "subnet3"]

	tags = {
	owner = "user"
	environment = "dev-environment"
	}
	}

	# Deploy the Fleet
	module "webserver_cluster" {
	source = "github.com/gmaentz/terraform_azure/modules/vmss"
	location = "${azurerm_resource_group.network.location}"
	resource_group_name = "${azurerm_resource_group.network.name}"
	virtual_network_name = "${module.network.vnet_name}"
	subnet_id = "${module.network.vnet_subnets[0]}"
	application_port = 80
	admin_user = "azureuser"
	admin_password = "AzureAdminP@ssword1"
	cluster_name = "webserver-dev"
	cluster_size = "2"
	instance_type = "Standard_D1_v2"
	cloud_config_file = "web.conf"
	tags = {
	owner = "user"
	environment = "dev-environment"
	}
	}

view raw azure_terraform_main hosted with ❤ by GitHub

All files to create this fleet in Azure including the main.tf, output.tf and VMSS module are available in the terraform_azure repository of my GitHub account.

We can initialize, plan and apply our deployment using Terraform and immediately see our Azure resources being built out through the Azure Portal and inside our devtest resource group.

Once the deployment is complete, we browse to the DNS name assigned to to the load balancer front ending our Azure VMSS group. This address displayed at the end of the Terraform output, as we included an output.tf file to list relevant information.

Browsing to the DNS name we can validate our deployment is now completed. At this time we can check the health of the deployment - remember there is a jump server that is accessible if needed.

Once you are happy with the state of the new fleet in Azure it can be torn down with a terraform destroy. I recommend doing this as we prepare for the next step in the series: Moving the Fleet from AWS to Azure.

This is part of a Terraform series in which we have covered:

Remembering to Clean Up with Terraform

October 1, 2018 Gabe Maentz

One of my favorite uses of Terraform is to quickly turn up an infrastructure environment with only a few lines of code. Of equal importance is the ability to tear down parts of the environment when they are no longer needed or need to be rebuilt. Terraform helps me leverage elasticity both in building, destroying and rebuilding as necessary.

Reminders

If you are like me, you tend to forget things and need reminders. I have been building out environments now for some time in an automated way, but I am not always the best at remembering to tear them down when I am done. Don’t get me wrong, the act of tearing things down is easy with commands like terraform destroy, but remembering to do so is where I have a gap.

To close that gap I wanted to create a monitoring and trigger mechanism that would remind me when my infrastructure is running idle, and to go clean it up. Since many of my deployments are in AWS, the two tools I will leverage to accomplish this are CloudWatch and SNS. For those not familiar, CloudWatch is a monitoring and management service provided by Amazon that provides operational metrics on the health of a given environment. SNS is a notification service that allows you to send messages to a variety of endpoints - including SMS text messages which is a great way to remind me of doing things.

Incorporating Monitoring into My Build

Defining CloudWatch and SNS is relatively easy in Terraform as both resources can be defined using the Terraform AWS provider. Examples for both can be found on the Terraform website, and I have folded them both into a module I created on GitHub.

We will use these resources to monitor when the our autoscaling group goes idle, which I define as less then 2% CPU every minute for 5 minutes. When that occurs send a text message to the supplied phone number. To keep it simple the module accepts both the autoscaling group to monitor and the phone number to send messages to as variables. There is nothing preventing us from also defining the thresholds and polling intervals as variables as well, and in fact is something that we should probably do in the future to make the module more robust.

	variable "autoscaling_group" {
	description = "The auto scaling group that should be monitored."
	}

	variable "sms_number" {
	description = "The number in which to send SMS text messages for Alerts, in format +14126552983"
	}

	resource "aws_sns_topic" "send_text" {
	name = "sendText"
	}

	resource "aws_sns_topic_subscription" "text_send_text_target" {
	topic_arn = "${aws_sns_topic.send_text.arn}"
	protocol = "sms"
	endpoint = "${var.sms_number}"
	}

	resource "aws_cloudwatch_metric_alarm" "alarm_minutes" {
	alarm_name = "terraform-idle_cpu_5_mins"
	comparison_operator = "LessThanOrEqualToThreshold"
	evaluation_periods = "5"
	metric_name = "CPUUtilization"
	namespace = "AWS/EC2"
	period = "60"
	statistic = "Average"
	threshold = "2"

	dimensions {
	AutoScalingGroupName = "${var.autoscaling_group}"
	}

	alarm_description = "This metric monitors ec2 cpu utilization every minute for 5 minutes"
	alarm_actions = ["${aws_sns_topic.send_text.arn}"]
	}
	resource "aws_cloudwatch_metric_alarm" "alarm_hours" {
	alarm_name = "terraform-idle_cpu_5_hours"
	comparison_operator = "LessThanOrEqualToThreshold"
	evaluation_periods = "5"
	metric_name = "CPUUtilization"
	namespace = "AWS/EC2"
	period = "3600"
	statistic = "Average"
	threshold = "2"

	dimensions {
	AutoScalingGroupName = "${var.autoscaling_group}"
	}

	alarm_description = "This metric monitors ec2 cpu utilization every hour for 5 hours"
	alarm_actions = ["${aws_sns_topic.send_text.arn}"]
	}

view raw cloud-watch.tf hosted with ❤ by GitHub

Using the Cloud-Watch Module

To make use of this module, we simply need to edit the main.tf file we have been using in development to include the cloud-watch module, which we will call from GitHub. We will pass the name of the auto scaling group created within the webserver_cluster module as an input for monitoring and prompt for the phone number to send the alert message to.

	variable "sms_number" {
	description = "The number in which to send SMS text messages for Alerts, in format +14126552983"
	}

	module "cloud_watch" {
	source = "github.com/gmaentz/terraform/modules/services/cloud-watch"
	sms_number = "${var.sms_number}"
	autoscaling_group = "${module.webserver_cluster.asg_name}"
	}

view raw cloud-watch snippet.tf hosted with ❤ by GitHub

Now when we deploy our fleet there will be a two cloud watch alarms created against the deployed auto-scaling group. One that will report on idle time in a 5 minute window, and the other reporting on idle time in a 5 hour window. The idea being that if I missed one text message, I will get the second so that I can perform a terraform destroy to tear down the environment when it is not being utilized.

Now that I have included the cloud-watch module to my development main.tf file let’s initialize (terraform init), plan (terraform plan), and deploy (terraform apply).

Notification and Clean UP

I can see that it successfully created my alarm in CloudWatch and tied it to the auto-scaling group it created when deploying the fleet.

Output from running a terraform apply, listing the DNS name and autoscaling group of the sever fleet.

CloudWatch Alarm - Two were created, one for 5 minute intervals and the other for 5 hour intervals.

Now when the environment goes idle, an alarm will trigger and send me a text message. Should I not take care of it at that time, another text message in 5 hours will be send should the environment remain idle.

Text Message from AWS SNS notifying me that my auto-scaling group has had idle CPU for the last 5 minutes.

Since Terraform makes it easy to cleanup (terraform destroy), I will be sure to perform that step to not incur costs for unused assets and environments. Terraform destroy will be sure to cleanup not only the environment it deployed but also the alarms and SNS notifications it created during buildout.

This is part of a Terraform series in which we have covered:

Creating A CloudMapper Virtual Appliance using Packer

September 11, 2018 Gabe Maentz

One of my favorite visualization tools for diagraming Amazon Web Services (AWS) environments is Duo CloudMapper. CloudMapper helps you understand visually what exists in your AWS accounts by running a collection against the environment and providing an interactive web page. This is extremely handy for identifying possible network misconfigurations, along with a slew of other benefits. For a full listing why I like this tool check out my post on How to Visualize Your Cloud Deployments with CloudMapper.

Despite it’s power, one of the challenges I have found is to simply get it started and working. CloudMapper is open source built upon other open source products and I have found that there are inevitably build and dependency issues that suck up my time before I can simply use the tool. For these reasons and to make things easier in general, I chose to create and deploy CloudMapper as virtual appliance.

Building the Virtual Appliance

I utilized Packer to provision my CloudMapper virtual appliance. Packer is excellent for creating machine images for multiple platforms from a single source configuration. In this case we will build out an Amazon Machine Image (AMI) with Packer, which will take care of all package installation and dependencies for the build out. You can learn more about all the Packer goodness on the HashiCorp website and Paul Kirby provides a nice overview in his Packer PluralSight course.

Install Packer
Download the cloudmapper.packer template from my GitHub account. (Packer templates are simply JSON files that specify the various components used to create the machine image, and where the build of the image will be saved. In our case we will be creating and deploying our virtual appliance into AWS, but Packer comes with support to build images for Amazon EC2, CloudStack, DigitalOcean, Docker, Google Compute Engine, Microsoft Azure, QEMU, VirtualBox, VMware, and more.)
Specify AWS Credentials for creating our virtual appliance. There are a number of ways to accomplish this but we will use environment variables.
```
   $ export AWS_ACCESS_KEY_ID="awsaccesskey" 
```
```
   $ export AWS_SECRET_ACCESS_KEY="awsecretkey"
```

Build the image.

$ packer build -var aws_region="us-west-2" -var ami_id="ami-6cd6f714" -var python_version="3.5.6" cloudmapper.packer

    # aws_region is where the image will be stored.

    # ami_id is the base Amazon Linux image in the region.

    # python version of your choice.

There are currently some issues with CloudMapper and Python 3.7, so I am using the recommend version of 3.5.6

The build process will take ~10-15 minutes as it needs to compile and pull down all of the components. Once it is complete, Packer will notify of your unique AMI that can now be used for deployment.

Deploying the Virtual Appliance

Now that the image for our virtual appliance is available in AWS, let deploy it and run CloudMapper. My preferred way to deploy would be using Terraform but for purposes of this post we will step through the manual steps.

Launch an instance using the newly created CloudMapper image. You can accept the defaults providing your instance a public IP with SSH access.

Configure CloudMapper by logging in via SSH and performing the final initialization steps. (While these could be automated and built into the image, I get sensitive about saving AWS credentials anywhere even if my image is private. I prefer to specify them when needed.)

```
$ aws configure
```
You can specify a full access account to run CloudMapper but I like least privilege so have setup a “Visualization” IAM user with the privileges specified in the CloudMapper readme.

Configure CloudMapper’s account information in the config.json file to match aws credentials:

$ cd ~/cloudmapper

$ pipenv run python3 cloudmapper.py configure add-account --config-file config.json --name AWS_USERNAME --id AWS_ACCESS_KEY_ID

   #AWS_USERNAME is “friendly name” tied to IAM account

   #AWS_ACCESS_KEY_ID is the AWS Access Key ID specified in aws configure.

Run CloudMapper’s collection against the environment. The collection phase can take some time, as it is truly pulling all the metadata information for your entire AWS account across all components and regions.
```
$ pipenv run python3 cloudmapper.py collect --account AWS_USERNAME
```

Prepare the results and launch the webserver to display them.

$ pipenv run python3 cloudmapper.py prepare --config config.json --account AWS_USERNAME

$ pipenv run python3 cloudmapper.py webserver --public

Create and attach a security group to the instance to make the site publicly available.

Browse to public DNS address of your virtual appliance on port 8000

Please note that these steps show running this instance with a publicly available website. You can certainly deploy this to a private subnet and access through a bastion server, etc which is recommended. It would also make sense to put this site behind a login which I have noted as an opportunity for further improvement. Be sure to stop this instance when you are done using it.

Further Improvements

Having a readily available virtual appliance that just works is perfect, but there are some further improvements that I think would be handy:

Create a docker image of CloudMapper that can be run as a container. (There are some folks who have built this)
Save the collection data to an external volume so that it doesn’t live in the running appliance.
Create the virtual appliance that can be deployed within other Packer supported platforms, namely vSphere and Azure.
Lock down the website behind a username and password.