What to use as CI for plans - (local/onprem/hosted, kitchen, delmo, concourse)


Hi, I asked a question in Slack, but in the end we thought it would be better to have a broader discussion in the community forum.

The core of my question is “how do you test plans locally, ideally in an n-tier setup, during development?”

I have seen some attempts (even here in the forum) to address the question of plan CI. In general, there are examples available with:

  • delmo (search/grep core habitat-plans or gh:starkandwayne/habitat-plans)
  • kitchen (I am aware it’s used (@sns), but no examples found)
  • concourse ci (pipelines available)
  • terraform (feasible, close to prod deployment, no examples found)

However, there are more topics to discuss, such as:

  • any framework/best practice on a roadmap
  • options for local testing vs. hosted/public like Travis
  • complex on-prem pipeline to test/build/push
  • platform support (the ability to test on your final platform (not multiplatform testing))
  • unit, integration testing
  • dependencies/plans from multiple origins (let me know if this is nonsense)
  • validation
  • linting, test configuration in .toml (variable names) fit best practice

Let’s discuss here the features of the tools mentioned above (or other tools) as they relate to Habitat: which tools are recommended and which are effectively deprecated, and what the actual use case for Habitat plan testing is, versus whether we are rather heading toward a more complex end-to-end/CI-CD scenario.

Regarding Kitchen test framework I have these additional questions:

  1. Is there any Kitchen ecosystem for Habitat, e.g. a kitchen-habitat plugin (I haven’t found one)? How can Kitchen be used with Habitat? A cookbook under the hood of Kitchen to install the Habitat Supervisor and then the plans? A dedicated image where hab sup is the entrypoint? A custom script as a Kitchen provisioner?

  2. Are we able to test a multi-node setup? There were some attempts in the past to work around Kitchen, which doesn’t support that out of the box (possibly in a similar way to delmo?).

  3. Plans frequently have dependencies; how do you deal with them? Do you leave that job to hab, downloading them from the “testing”/“stable” channel, or fetch them on your own? Do you always test the whole habitat-plans repo, or a single plan?

  4. I believe it should be possible to use as small a base image as possible, like Alpine. By the way, are there any images with a pre-baked hab sup?

  5. Kitchen has suites, so I could use a user.toml per suite to test different configurations. Am I right that using/uploading this file is the correct way to test different setups?


I’ll give a quick answer to get things started. I use Test Kitchen, because it’s readily available and easy to plug into CI/CD frameworks.

To your specific questions:

  1. Is there a kitchen-habitat ecosystem?

Not that I am aware of. If it existed, what would it do? Kitchen is designed to provision infrastructure and run “integration” tests on a per-node basis. Habitat explicitly stays out of provisioning, and is explicitly trying to move the paradigm to services, relationships, and orchestration. I can’t immediately see what a kitchen-habitat would do, or how it would work.

Instead, I use kitchen-docker locally, and kitchen-ec2 in a Gitlab CI/CD pipeline. A machine is provisioned, and I have a Chef cookbook that installs Habitat, installs the packages, starts the services, and then Kitchen runs tests on the running instance using inspec.
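As a sketch, a minimal `.kitchen.yml` for this flow might look like the following. This is an assumption-laden example, not the actual configuration: the cookbook name `habitat_wrapper` is hypothetical, and the driver/verifier settings will vary by environment.

```yaml
# Hypothetical minimal .kitchen.yml (sketch) for the docker + Chef + InSpec flow.
driver:
  name: docker        # kitchen-ec2 would be swapped in for the CI pipeline

provisioner:
  name: chef_zero

verifier:
  name: inspec

platforms:
  - name: centos-7

suites:
  - name: default
    run_list:
      - recipe[habitat_wrapper::default]  # installs hab, packages, starts services
```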

This allows me to verify that the services start properly, and allows me to test bindings. It also allows me to do some primitive testing that the services are doing the right thing.
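For illustration, a service-level check of that kind could be sketched as an InSpec control. The port and service here are placeholders, not taken from the post:

```ruby
# Sketch of an InSpec control; adjust port/service to your own plan.
# Verify the Habitat-run service is actually listening.
describe port(5432) do
  it { should be_listening }
end

# Confirm the Supervisor is reachable and reporting service status.
describe command('hab svc status') do
  its('exit_status') { should eq 0 }
end
```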

  2. I strongly believe Test Kitchen is the wrong tool for attempting multi-node testing; however, you can look at https://github.com/chef/chef-rfc/blob/master/rfc084-test-kitchen-multi.md for ideas.

I think multinode testing would be great, but I think it should be done with a different framework.

  3. Habitat is built to solve dependencies. Just let Hab solve the dependencies and you’re done.

  4. I don’t know of any images with a baked-in Supervisor, but I think they’d be easy to do. What you use to build that is up to you. I use CentOS 7, as minimal as it goes. It’s easy to work with, and well-supported by Chef. Anything else would do fine.

  5. Sure, you could have as many suites as you like, and yes, you could run different run lists with different user tomls to provide different config.
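To illustrate that last point, suites in `.kitchen.yml` can carry different attributes that a wrapper cookbook turns into different `user.toml` contents. All names below are hypothetical, just to show the shape:

```yaml
# Sketch: one suite per configuration; a (hypothetical) wrapper cookbook
# writes the chosen attributes out as /hab/user/<service>/config/user.toml.
suites:
  - name: small
    attributes:
      myapp:
        user_toml:
          worker_count: 2
  - name: large
    attributes:
      myapp:
        user_toml:
          worker_count: 16
```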

What I haven’t yet worked out is a good way to do “multi-step” tests with InSpec, i.e. “do this… then do that”. It may be that InSpec is the wrong tool for the job here. I may end up writing more sophisticated tests which are delivered as a payload within a cookbook and run by bats, which just watches for the response. I’ve not decided.

I’ve also not given any thought to testing containerised Habitat services, but this should not really be needed. One of the key ideas of Habitat is your artifact is the same and behaves the same whether you’re running it in your hab studio, on a local VM, in EC2, or in K8S.

I have this plumbed into Gitlab CI/CD, so that merges to master trigger a pipeline, which runs the Kitchen tests, and on success will tag, and publish the cookbook. That’s because I’m currently using Chef as the deployment mechanism for Habitat. If you’re only using the cookbooks to test the Habitat stuff, you may not need that.

Hope some of this is of use to you! Feel free to follow up with thoughts / criticisms.


Some exposition: this is definitely a hard problem, and of course it’s not unique to Habitat. What we’re actually discussing testing here is microservices and the applications that follow that paradigm. My personal opinion is that unit tests still belong in code; testing the Habitat plan itself isn’t very useful, but testing the behavior of the Habitat artifacts after the build is quite important, because of Habitat’s nature of pushing so much runtime concern as far left as possible. What I shoot for in any testing strategy, functional or otherwise, is the ability to run a test harness by hand as well as via CI. I’m not a fan of having separate testing behaviors between my local development and CI environments.

I think @sns did a great job of outlining some of the reasons why test-kitchen is probably not the right tool in a Habitat context today. For myself, I’ve been using a delmo-based test harness that leverages docker-compose. The two together give me everything I need to test multi-node clusters and varying configurations. I’ve got pipelines that generate generic docker images specifically intended to have packages and configurations injected into them, and effectively shell-script-based test harnesses that use those images to validate functionality. For an example of this pattern you can check out some of the starkandwayne examples in this repo: https://github.com/starkandwayne/habitat-plans/tree/master/postgresql . It basically includes a test directory in each package directory containing everything anyone could need to test that package functionally. On a local system you just need delmo and docker-compose. In CI it gets only slightly more complicated: you need those binaries plus a docker API running somewhere that your test orchestrator can reach. I haven’t tested this in the context of Travis or the like, but it works exceptionally well with Concourse on-prem.
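A docker-compose sketch of the multi-node part of this pattern might look like the following. The image name and flags are illustrative assumptions (`myorg/hab-base` is a hypothetical image with the hab binary pre-baked); see the starkandwayne repo for the real thing:

```yaml
# Sketch of a docker-compose file for a small Supervisor ring.
version: '2'
services:
  seed:
    image: myorg/hab-base   # hypothetical image with hab pre-installed
    command: hab sup run core/postgresql --topology leader
  follower1:
    image: myorg/hab-base
    command: hab sup run core/postgresql --topology leader --peer seed
  follower2:
    image: myorg/hab-base
    command: hab sup run core/postgresql --topology leader --peer seed
```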

All of this, though, I think is a band-aid over some potential future behaviors for the Habitat kit. Who knows if we’ll actually end up biting off the CI portion of the lifecycle; right now I don’t think we have a clear decision on that, but even without doing so I think there are some things we could do to make CI patterns more consumable for people. We’ve discussed writing up some Jenkins pipelines that people can consume themselves, and we have Concourse pipelines like that today. As Builder grows and adds more APIs, the shape of those things could change. The reason I haven’t personally written a post about the Concourse pipelines I wrote is that we don’t yet have them running for prod. I think what we need to focus on is how folks are testing microservices functionally today, because even when you’re not packaging microservices with Habitat, the nature of the system shares a lot of traits with those paradigms.


@epcim, we do things a bit unconventionally at smartB, choosing a blended approach of testing inside the plan.sh and running a suite of Inspec tests against our validation environment(s).

It’s my personal belief that the right long-term approach for Habitat is to allow assertions to be made inside package metadata, so that we can catch issues as far “left” in the build pipeline as possible. Until we have a richer set of features available inside the build Studio this is a little bit tricky. For now, we cram as much as we can into the do_check callback:

do_check() {
  # stand up our testing-specific Postgres service:
  export PGHOST=""
  export PGPORT="6432"
  export PGUSER="admin"
  export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$(pkg_path_for core/gcc-libs)/lib:$(pkg_path_for core/libffi)/lib:$(pkg_path_for core/pcre)/lib"
  export HAB_POSTGRESQL="dynamic_shared_memory_type = 'none'"
  export SMARTB_DATABASE_URL_TEST="postgresql://${POSTGRES_IP}/smartb?client_encoding=utf8"
  export SMARTB_METERDATA_DB_URL_TEST="postgresql://${POSTGRES_IP}/meterdata?client_encoding=utf8"
  rm -rf /hab/svc/postgresql_testdb/* || true
  mkdir -p /hab/svc/postgresql_testdb
  chown -R hab /hab/svc
  HAB_BLDR_CHANNEL=stable hab svc start smartb/postgresql_testdb &
  until hab pkg exec smartb/postgresql_testdb pg_isready --dbname=postgres --port=6432; do
    echo "postgresql starting"
    sleep 2
  done
  chpst -u hab createdb --username=admin --no-password smartb
  hab svc status

  # run Python unit tests and perform codestyle checks
  pushd /src
    find . -name "*.pyc" -type f -delete
    pip install --progress-bar off --requirement requirements_dev.txt
    echo "running pep8"
    pycodestyle smartb | tee $pkg_prefix/pep8.txt
    echo "running unit tests"
    py.test --maxfail=1 --junitxml junit.xml --junitprefix smartb/ --cov-report xml --cov-report html --cov smartb
    cp coverage.xml $pkg_prefix/coverage.xml
    pip uninstall --yes --quiet --requirement requirements_dev.txt
  popd

  # gracefully clean up test Postgres data
  hab svc stop smartb/postgresql_testdb
  rm -rf /hab/svc/postgresql_testdb/data/*
  echo $pkg_prefix | sed 's@/hab/pkgs/@@' > $pkg_prefix/version.txt
}

Thanks for the responses. They were very useful.

FYI, I later found there is an attempt at kitchen-habitat: https://github.com/test-kitchen/kitchen-habitat.

I mostly agree with eeyun’s statement:

  1. unit tests still belong in code,
  2. testing the habitat plan isn’t very useful
  3. testing the behavior of the habitat artifacts after build is quite important

(well, the 2nd point seems controversial, but I guess we all agree the plan.sh CI/build phase cross-checks it enough)

So what is the summary:

  • we found Kitchen is not the best tool for everything, especially not for multi-node
  • delmo is a good-enough alternative today for simple multi-node + “multi-step” testing
  • referring to eeyun’s point (3), the future is possibly in adding some CI functionality to Habitat itself, like assertions/InSpec-style checks.
  • Python unit tests and codestyle checks (even for all .sh) should IMHO already be part of a “default” do_check() in Habitat.

Regarding the do_check() bixu shared: if I understand correctly, ideally only the assertions should live here, at best linked, and the “setup phase” (non-cross-platform, possibly duplicated in .ps1) is something we should avoid. Actually, what you do here is what I call the “client” role (where you create the DB). In K8s you would use a side-car container to init your pgsql. I believe it’s not worth “spoiling” Habitat with this kind of functionality; instead I would suggest using chef/salt, or the application itself, to configure these. I have an example of such a chef/salt approach here: http://apealive.net/post/helm-charts-wheelhouse/#_engine (passing yml metadata by pipe to a “salt” container to do the job). Anyway, for Habitat that should be easier.

I would even say this do_check() or any other assertions are more of a case for “composite” plans, which could reasonably own a piece of the CI workflow, or rather the “cross-Habitat-artifact behavior”.

Some examples or best practices on how to do this with composites would be beneficial.