Doing things the hard way: Using Gitlab, Habitat and Chef to automatically issue letsencrypt certificates

TLDR
I started out using habitat just to deploy static frontend files to our nginx server. However, I quickly found out that most of the interesting habitat features are unavailable that way. Our nginx setup was deeply integrated with the ssl setup based on the acme http challenge, so I needed to find a way to move everything to habitat. I figured the only reasonable way forward was to have a single node take care of the certificates, but we did not want to manage another node just for that. This post describes a way to set up gitlab CI to take care of cert issuance and renewal together with chef and habitat. Any feedback or suggestions on how to make this less complicated are appreciated, as I might turn this into a blog post.

Gitlab

First we need to set up quite a few secret variables for the gitlab CI job:

ACME_EMAIL: email address used to issue the certs
HAB_PUB_KEY: Habitat public key content e.g. from ~/.hab/cache/keys/*.pub (this will be a multi-line string)
HAB_SIGN_KEY: Habitat private key content e.g. from ~/.hab/cache/keys/*.sig.key (this will be a multi-line string)
HAB_KEYS_NAME: name of the habitat keys (what * would expand to in previous two variables: org-timestamp)
HAB_TOKEN: Habitat access token
HAB_ORIGIN: Habitat origin name
DNS_USER: auth user for dns provider
DNS_TOKEN: auth token/pass for dns provider 
DOMAIN: domain name e.g.: example.test
GITLAB_ACCESS_TOKEN: Gitlab personal access token (read_registry is sufficient)
ENCRYPT_CERT_SECRET: An arbitrary string used to encrypt the private key and cert
ENCRYPT_ARTIFACT_SECRET: An arbitrary string used to encrypt the config directory
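
With this many variables it is easy to forget one, and the job then fails somewhere in the middle of a certbot run. A minimal sketch of a preflight check, assuming it runs at the top of the CI script; the helper name require_vars is hypothetical and not part of the original project:

```shell
#!/bin/sh
# Hypothetical helper: fail early if any required CI variable is empty or unset.
# Only pass trusted variable names, as the lookup goes through eval.
require_vars() {
  missing=""
  for name in "$@"; do
    # Indirect lookup that also works in POSIX sh / ash (no bash ${!name}).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing required variables:$missing" >&2
    return 1
  fi
}

# Example call (commented out so the snippet can be sourced safely):
# require_vars ACME_EMAIL HAB_TOKEN HAB_ORIGIN DNS_USER DNS_TOKEN DOMAIN || exit 1
```

The function only reports names, never values, so nothing secret ends up in the job log.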

cert project

The code which will be executed when the CI is run consists of a .gitlab-ci.yml file and three bash scripts:

.gitlab-ci.yml:

image:
  name: certbot/certbot
  entrypoint: [""]

deploy:
  stage: deploy
  variables:
    # ACME_URL: "https://acme-staging-v02.api.letsencrypt.org/directory"
    ACME_URL: "https://acme-v02.api.letsencrypt.org/directory"
  before_script:
    - apk add curl
  script:
    # - 'curl --location --header "PRIVATE-TOKEN: $GITLAB_ACCESS_TOKEN" -o artifact.zip --silent "https://gitlab.com/$CI_PROJECT_PATH/-/jobs/artifacts/master/raw/artifact.zip?job=deploy"'
    # - openssl enc -aes-256-cbc -d -pass env:ENCRYPT_ARTIFACT_SECRET -in artifact.zip | tar xzvC /
    - certbot certonly --manual --preferred-challenges dns-01 -m "$ACME_EMAIL" --manual-public-ip-logging-ok --agree-tos --no-bootstrap --non-interactive --manual-auth-hook './src/authenticate.sh' --manual-cleanup-hook './src/cleanup.sh' --deploy-hook './src/habitat.sh' --domains "*.$DOMAIN" --cert-name "$DOMAIN" --server "$ACME_URL"
    - tar -zc /etc/letsencrypt/ | openssl enc -aes-256-cbc -out artifact.zip -pass env:ENCRYPT_ARTIFACT_SECRET
  artifacts:
    paths:
      - artifact.zip

We instruct the gitlab CI runner to use the official certbot docker image, override the entrypoint (which defaults to the certbot executable) and define a job which executes the actual code. The certbot image is alpine based, so we install curl before doing anything else, as the preinstalled wget does not support arbitrary http methods such as the DELETE request the cleanup script sends. (The commented-out lines will be re-enabled later.)

The script itself just runs certbot, which calls different bash scripts on its hooks:

--manual-auth-hook: authenticate.sh
--manual-cleanup-hook: cleanup.sh
--deploy-hook: habitat.sh

authenticate.sh adds the dns challenge record that certbot provides to prove control of the domain. In this case the script uses the name.com api and is quite similar to the example from the certbot documentation.

CHALLENGE_HOST="_acme-challenge"

echo "setting validation host: $CHALLENGE_HOST answer: $CERTBOT_VALIDATION for domain: $CERTBOT_DOMAIN"

# add challenge to dns
RECORD_ID=$(curl -u "$DNS_USER:$DNS_TOKEN" -X POST --silent --data '{"type":"TXT","host":"'"$CHALLENGE_HOST"'","answer":"'"$CERTBOT_VALIDATION"'","ttl":300}' "https://api.name.com/v4/domains/$DOMAIN/records" | python -c "import sys,json;print(json.load(sys.stdin)['id'])")

echo "dns record id: $RECORD_ID"

# Sleep to make sure the change has time to propagate over to DNS
sleep 25

# Save info for cleanup
if [ ! -d "/tmp/CERTBOT_$CERTBOT_DOMAIN" ]; then
  mkdir -m 0700 "/tmp/CERTBOT_$CERTBOT_DOMAIN"
fi
echo "$RECORD_ID" > "/tmp/CERTBOT_$CERTBOT_DOMAIN/RECORD_ID"

The id of the DNS record is persisted on the file system, so we will be able to remove it within the cleanup script.
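
The fixed sleep of 25 seconds is a guess at how long propagation takes. One alternative would be to poll until the record is visible. A minimal sketch of a generic retry helper follows; the name retry_until is hypothetical, and the dig call in the comment is an assumption that requires dig to be installed in the image first:

```shell
#!/bin/sh
# Hypothetical helper: run a command up to <attempts> times,
# sleeping <delay> seconds after every failed attempt.
retry_until() {  # retry_until <attempts> <delay> <command> [args...]
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Possible use in authenticate.sh (assumes dig is available):
# txt_visible() { dig +short TXT "_acme-challenge.$CERTBOT_DOMAIN" | grep -q "$CERTBOT_VALIDATION"; }
# retry_until 12 5 txt_visible || exit 1
```

Whether this is more reliable than a plain sleep depends on which resolver is queried, so treat it as a starting point rather than a drop-in replacement.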

cleanup.sh:

if [ -f "/tmp/CERTBOT_$CERTBOT_DOMAIN/RECORD_ID" ]; then
  RECORD_ID=$(cat "/tmp/CERTBOT_$CERTBOT_DOMAIN/RECORD_ID")
  echo "removing validation record with id: $RECORD_ID from dns: $DOMAIN"
  rm -f "/tmp/CERTBOT_$CERTBOT_DOMAIN/RECORD_ID"
  curl -u "$DNS_USER:$DNS_TOKEN" -X DELETE --silent "https://api.name.com/v4/domains/$DOMAIN/records/$RECORD_ID"
fi

Now we should be able to issue a new certificate. The habitat.sh script takes care of encrypting the certs and uploading them to builder. To do so, I found that I needed both the private and the public key as well as the habitat token, so we restore these from the data we keep in the gitlab variables. Then fullchain.pem and privkey.pem are archived with tar and encrypted with openssl, using the secret stored in the gitlab variable ENCRYPT_CERT_SECRET. Finally habitat is installed, and the package is built and uploaded to builder.
I found that habitat runs fine within the gitlab CI environment, even though that isn’t the case when run inside a plain docker container.

habitat.sh:

export HAB_NONINTERACTIVE=true

# restore habitat keys
mkdir -p /hab/cache/keys/
echo "$HAB_SIGN_KEY" > /hab/cache/keys/$HAB_KEYS_NAME.sig.key
echo "$HAB_PUB_KEY" > /hab/cache/keys/$HAB_KEYS_NAME.pub

# create an encrypted archive of the certificates, which will be shipped to the habitat builder
mkdir results
tar -zcv --dereference --directory /etc/letsencrypt/live/$DOMAIN/ fullchain.pem privkey.pem | openssl enc -aes-256-cbc -out results/certs.zip -pass env:ENCRYPT_CERT_SECRET

# install habitat
curl --silent https://raw.githubusercontent.com/habitat-sh/habitat/master/components/hab/install.sh | /bin/ash

# build and upload habitat package
hab pkg build .
hab pkg upload --auth "$HAB_TOKEN" --channel stable results/*.hart

After all that is done, the last line of the script section in .gitlab-ci.yml archives and encrypts the entire /etc/letsencrypt directory. This file is then stored as the artifact of the CI pipeline.
There are two lines within .gitlab-ci.yml that are commented out. These download the artifact of the last successful job and restore the /etc/letsencrypt directory. Setting this up takes two commits: the initial commit with the two lines commented out (as there is no previous job to download the artifact from), and a second commit which enables persistence, so we do not issue new certs on each CI run. This could probably be baked into the script as well, but commenting these two lines out also provides a means of starting over with a fresh certbot account and new certs.
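
One way the two-commit dance might be baked into the script is to skip the restore whenever no previous artifact is present. This is a sketch under assumptions: the function name restore_state is hypothetical, and it expects the download step to leave no file (or an empty one) when there is nothing to fetch, for example by calling curl with --fail.

```shell
#!/bin/sh
# Hypothetical helper: restore the certbot state only if an artifact exists.
restore_state() {  # restore_state <archive> <dest_dir>
  # -s is true only for an existing, non-empty file
  if [ ! -s "$1" ]; then
    echo "no previous artifact, starting fresh"
    return 0
  fi
  mkdir -p "$2"
  openssl enc -aes-256-cbc -d -pass env:ENCRYPT_ARTIFACT_SECRET -in "$1" |
    tar -xz -C "$2"
}

# In the CI job this would be called as: restore_state artifact.zip /
```

Starting from scratch would then mean deleting the stored artifact instead of editing .gitlab-ci.yml.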

The habitat plan itself is very simple, as it just needs to package a static compressed file:

do_build() {
  return 0;
}

do_install() {
  cp results/certs.zip "${pkg_prefix}/certs.zip"
}

nginx setup

nginx itself is a pretty basic habitat setup, which has a pkg_deps on core/nginx. We are using the init hook to decrypt the certs using a password which has been persisted to the file system using chef. The actual certificates will be placed into our nginx package within the directory ssl/<certVersion>/<certTimestamp>. If there is no cert package, it just creates two empty files. Both variables certVersion and encryptedCertsPath will be updated as a runtime configuration:

#!/bin/sh
mkdir -p "/var/log/nginx"

CERT_PATH={{pkg.path}}/ssl/{{cfg.certVersion}}
mkdir -p $CERT_PATH
if [ -f {{cfg.encryptedCertsPath}}/certs.zip ]; then
  openssl enc -aes-256-cbc -d -pass file:/etc/.ssl -in {{cfg.encryptedCertsPath}}/certs.zip | tar xzvC $CERT_PATH
else
  touch $CERT_PATH/fullchain.pem
  touch $CERT_PATH/privkey.pem
fi

The SSL part inside the nginx conf then just looks like this:

ssl_certificate {{pkg.path}}/ssl/{{cfg.certVersion}}/fullchain.pem;
ssl_certificate_key {{pkg.path}}/ssl/{{cfg.certVersion}}/privkey.pem;

And just in case anyone is looking for how to include the mime.types within the nginx config of a package which depends on core/nginx:

include       {{ pkgPathFor "core/nginx" }}/config/mime.types;

Don’t forget to add the two configurable variables to the default.toml:

# top-level keys, so the templates can read them as cfg.encryptedCertsPath and cfg.certVersion
encryptedCertsPath = ""
certVersion = ""

Chef

Lastly we need to find a way to trick habitat into reloading the certs whenever we install a new cert package. The only way I was able to do this is by passing the cert version as a habitat configuration, as packages which only include static files do not get run by the supervisor. That is also why we decrypt the files within the nginx init hook.

So, we need to read the secret used to decrypt the certs and store it on the filesystem so that habitat can pick it up:

habitat_bag = data_bag_item('habitat', 'my bag')

file '/etc/.ssl' do
  content "#{habitat_bag['secret']}"
end

Then we only need to install the cert package:

hab_package 'myorg/cert' do
  action :install
  channel 'stable'
end

Get the path for the cert package:

ruby_block 'get cert dir' do
  block do
    node.run_state['encryptedCertsPath'] = shell_out('hab pkg path myorg/cert').stdout.chomp
  end
end

Load our frontend package, which depends on nginx:

hab_service 'myorg/www' do
  action :load
  strategy 'at-once'
  channel node.habitat['channel']
end

And lastly push the new config to the supervisor. The certVersion just contains the version and release part of the path, e.g. 0.1.0/20180608165950, as we cannot pass that value from the init hook to the nginx.conf template.

hab_config 'www.default' do
  config(
    lazy {
      {
        encryptedCertsPath: node.run_state['encryptedCertsPath'],
        certVersion: node.run_state['encryptedCertsPath'].split('/')[5, 2].join('/')
      }
    }
  )
end
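
To see what the split('/')[5, 2] expression picks out, the same extraction can be tried on the command line. A small sketch, assuming the default /hab/pkgs/<origin>/<name>/<version>/<release> layout; the function name cert_version is made up for this example:

```shell
#!/bin/sh
# Extract "<version>/<release>" from a habitat package path.
# With "/" as the delimiter the leading slash yields an empty first field,
# so version and release are fields 6 and 7.
cert_version() {  # cert_version <package path>
  printf '%s\n' "$1" | cut -d/ -f6-7
}

cert_version /hab/pkgs/myorg/cert/0.1.0/20180608165950
# prints 0.1.0/20180608165950
```

Both variants break if packages are installed under a different root, since they rely on fixed positions in the path.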

In order to test cert renewal one can add the --renew-by-default option to the certbot command, which will renew the certs on every run. Now don’t forget to schedule the gitlab CI run regularly.

I just deployed this to our test environment, so we will see what happens when certbot should automatically renew the certs after 60 days. I could not find a way to tell certbot to renew the certs any earlier, so time will tell if that works as well…
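
While waiting for the first real renewal, one can at least check whether a given certificate has entered the renewal window. A minimal sketch using openssl x509 -checkend; the function name cert_needs_renewal and the 30 day default are assumptions for illustration:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the cert expires within <days> (default 30).
# A missing or unreadable cert file also counts as "needs renewal".
cert_needs_renewal() {  # cert_needs_renewal <cert.pem> [days]
  days=${2:-30}
  # -checkend exits 0 if the cert is still valid after the given number of seconds
  if openssl x509 -checkend "$((days * 86400))" -noout -in "$1" >/dev/null 2>&1; then
    return 1
  fi
  return 0
}

# Example (path as used by certbot for the cert name $DOMAIN):
# cert_needs_renewal "/etc/letsencrypt/live/$DOMAIN/fullchain.pem" && echo "renewal due"
```

Running this in a scheduled job would show whether certbot should have renewed already.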

I’m looking forward to that rotation moment. :)

Almost every step regarding cert renewal can be tested by adding the --renew-by-default flag, which renews certs on every build. Unless certbot is broken and does not renew certs 30 days prior to expiration, not much unexpected should happen.

I also put the project up on github so it’s easier to follow along: https://github.com/st-h/letsencrypt-habitat-runner

As the end of the year approaches, it seems like a good time to recap this exciting mashup of workarounds…
All in all, despite its complexity, things have been going pretty well. Most of the issues we had were related to regressions or incompatible third-party changes:

  • the init hook in the hab plan is not the right place to decrypt the archive, as it is not necessarily called every time we need it. Moving that code to the run hook is more appropriate. Luckily we caught this before the first renewal was due.
  • gitlab had a security issue. The way they resolved it broke the artifact download api we were using, which led to having to download the whole artifact archive and manually extract the relevant bits.
  • we used the ash shell to install habitat, as that’s what the certbot docker image comes with. However, for a few releases now, the hab installer no longer runs under ash and prints a syntax error: /bin/ash: syntax error: unexpected "(" (expecting "}"). The quick fix was to install bash before trying to install hab.