Building AMIs Incrementally With Packer and Make

At Qwaya we run our servers in AWS. We build our own base AMIs to ensure a stable base infrastructure and to be able to launch new servers quickly when needed.

We have a few AMIs, arranged as shown in the image below:

[Image: Qwaya AMIs]

  • The base image holds all the packages we want installed on all hosts. It sets up Docker, logspout logging to Papertrail, and Datadog monitoring.
  • The Web and Worker images hold different systemd files for different parts of our application.
  • We use Buildkite for building and deploying, which uses a bring-your-own approach for build servers. We have two kinds of build server image, one for building and one for deploying.

While we really should have our configs in something like Consul, we’re not there at the moment. To limit the number of AMI rebuilds, we try to place configs in cloud-init.

Make

We use Make as our build tool so that only the necessary AMIs are rebuilt after a config change. Normally, Make builds output files from source files, all of which are local. However, the AMIs we’re building are of course not local; they’re stored in the AWS cloud.

Instead we generate a proxy target file locally, one for each AMI, that stores the generated AMI id. This file is then the target for building the AMI.

Again, the AMIs are not stored locally, so we need to make sure that the generated AMI id files are available when building AMIs on other machines. We upload the generated AMI id files to an S3 bucket, and the Makefile downloads all of them before commencing the build.

It basically looks like this:

# Set a variable for each AMI file
BASE_AMI_ID_FILE=$(AMI_ID_DIR)/base_ami.txt
WEB_AMI_ID_FILE=$(AMI_ID_DIR)/web_ami.txt
...

# Set a variable that holds all AMI id files
AMI_ID_FILES=$(BASE_AMI_ID_FILE) $(WEB_AMI_ID_FILE) ...

# For each AMI id file, declare dependencies and the build command
$(BASE_AMI_ID_FILE): $(shell find ansible/roles/... -type f) packer/base_ami.json ansible/base_ami.yaml
	./build_packer.sh centos_ami base_ami

...

# Finally, declare a .PHONY target that downloads the latest AMI
# id files and builds the AMIs
.PHONY: amis get-latest-ami-ids
amis: get-latest-ami-ids $(AMI_ID_FILES)
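
The proxy file works because Make only compares modification times: a target is rebuilt when it is missing or older than one of its prerequisites. As an illustration only, that check can be sketched in shell. The function name `needs_rebuild` is hypothetical and not part of our build:

```shell
#!/usr/bin/env bash
# Hypothetical helper: mimics Make's "is the target out of date?" check.
# usage: needs_rebuild <target_file> <prerequisite>...
# Returns 0 (true) when the target must be rebuilt, 1 otherwise.
needs_rebuild() {
    local target=$1
    shift

    # A missing AMI id file always triggers a build.
    [ -e "$target" ] || return 0

    local prereq
    for prereq in "$@"; do
        # -nt is true when the prerequisite is newer than the target.
        if [ "$prereq" -nt "$target" ]; then
            return 0
        fi
    done
    return 1
}
```

Called as `needs_rebuild build/ami_ids/base_ami.txt packer/base_ami.json ansible/base_ami.yaml`, it succeeds exactly when one of the listed sources has changed since the AMI id file was last written.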

Packer

As can be seen in the Makefile above, we have a local script, build_packer.sh, that runs Packer. It takes two parameters: the name of the box to base the new box on, and the name of the box to build. The names map to the AMI id files mentioned above, and the name of the box to build is also the name of the Packer JSON file.

#!/bin/bash
# syntax: build_packer.sh <from_box> <to_box>

if [ $# -ne 2 ]
then
    # Both box names are required
    echo "usage: $0 <from_box> <to_box>" >&2
    exit 1
fi

build_from=$1
build_target=$2

ami_dir=build/ami_ids
build_file=packer/${build_target}.json
build_result_file=${ami_dir}/${build_target}_packer_result.txt
new_ami_id_txt_file=${build_target}.txt
new_ami_id_txt_file_path=${ami_dir}/${new_ami_id_txt_file}
old_ami_id_file=${ami_dir}/${build_from}.txt

mkdir -p ${ami_dir}

export BRANCH_NAME=$(git rev-parse --abbrev-ref HEAD)
export COMMIT_ID=$(git rev-parse --short HEAD)

# Read the base AMI id from ${old_ami_id_file}
SOURCE_AMI_ID=$(cat "${old_ami_id_file}")
if [ -z "${SOURCE_AMI_ID}" ]
then
    echo "No source AMI id found in ${old_ami_id_file}" >&2
    exit 1
fi
# Current base ami id: ${SOURCE_AMI_ID}
export SOURCE_AMI_ID

# Run Packer to build box ${build_target} using ${build_file}
# tee the build log to a result file
packer build ${build_file} 2>&1 | tee ${build_result_file}

# Check for errors in log file
if grep -Eq 'Non-zero exit status|Error' ${build_result_file}
then
    echo "Build failed, exiting"
    exit 1
fi

# Every now and then there's a glitch in AWS
if grep -q "Build 'amazon-ebs' errored:" ${build_result_file}
then
    echo "Amazon error in build, exiting"
    exit 1
fi

# Find the newly created AMI id in the output
new_ami_id=$(awk '/^us-east-1/{print $2}' "${build_result_file}")

# Only record and publish the id if one was found; an empty id file
# would look up to date to Make and break dependent builds
if [ -z "${new_ami_id}" ]; then
    echo "No AMI id found in ${build_result_file}, exiting" >&2
    exit 1
fi

# Write the new AMI id to ${new_ami_id_txt_file} and upload it
echo "${new_ami_id}" > "${new_ami_id_txt_file_path}"
aws s3 cp "${new_ami_id_txt_file_path}" "s3://ami-ids/${new_ami_id_txt_file}" --acl public-read
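
The AMI id extraction relies on the tail of Packer's human-readable output, where each artifact is listed as `<region>: <ami-id>`. As a small sketch of that step in isolation, assuming the log ends the way the script above expects: the function name `extract_ami_id` is hypothetical, the region is passed in rather than hard-coded, and the sample log and AMI id are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the AMI id that a Packer log reports for a region.
# usage: extract_ami_id <region> < packer_log
extract_ami_id() {
    local region=$1
    # Match lines such as "us-east-1: ami-0123456789abcdef0" and keep the
    # last match, in case the region appears as a first field more than once.
    awk -v prefix="${region}:" '$1 == prefix { id = $2 } END { if (id != "") print id }'
}

# Assumed shape of the log tail; the AMI id is fabricated sample data.
sample_log='==> Builds finished. The artifacts of successful builds are:
--> amazon-ebs: AMIs were created:
us-east-1: ami-0123456789abcdef0'

printf '%s\n' "$sample_log" | extract_ami_id us-east-1
```

When no matching line exists the function prints nothing, so the caller can treat an empty result as a failed build instead of writing an empty id file.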

Caveats

We version all of our code using Git and GitHub. In a fresh clone of a repository, the modification time of every file is set to the time of the clone, not the time the file was last committed.

This means that on a fresh clone, all AMI source files have the same, recent modification time, so Make considers every AMI out of date and rebuilds them all when the build is run. This is normally only a problem when we provision new Buildkite build machines.
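
One possible workaround, which is not part of the setup described here, is to reset each tracked file's modification time to the time of its last commit right after cloning. A rough sketch, assuming GNU `touch` and file names without newlines or unusual characters; the helper name `restore_mtimes` is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: set every tracked file's mtime to the committer
# timestamp of the last commit that touched it.
# Assumes GNU touch (-d @SECONDS) and simple file names.
# usage: restore_mtimes [repo_dir]
restore_mtimes() {
    local repo=${1:-.}
    local file ts
    git -C "$repo" ls-files | while IFS= read -r file; do
        # %ct is the committer date as a Unix timestamp.
        ts=$(git -C "$repo" log -1 --format=%ct -- "$file")
        if [ -n "$ts" ]; then
            touch -d "@${ts}" "$repo/$file"
        fi
    done
}
```

This runs one `git log` per file, so it is slow on large repositories, but for a handful of Packer and Ansible files it should be enough to keep Make from rebuilding every AMI on a new build machine.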

Conclusion

We find this a decent way to build AMIs that depend on each other. However, building an AMI takes a few minutes, so I would advise against creating large dependency trees, where a complete rebuild would take hours rather than minutes.