Request for Input - Handling breaking changes to core plans when software is not being updated


#1

Hello all!

This is not quite at the point of an RFC yet, so I’m starting the discussion here.

I’m wondering how we should handle making breaking changes to core plans when we are not doing a major update of the software that plan packages.

A good example is in this pull request - which greatly simplifies the configuration for core/mongodb. This is a potentially breaking change to anyone using core/mongodb. Normally I would convey that it is a breaking change through use of semantic versioning, but in this case (as is the case with most of our core plans) the plan version tracks the version of the software we are packaging.

We have a few options as I see them, but would love to hear ideas for more:

  1. Wait until a major version change of the software before we do a breaking change to the plan (not ideal at all)
  2. Somehow communicate that this is a breaking change - maybe by appending something to the version number? This also seems less than ideal.

What are your thoughts on this, oh great Habitat community?


#2

Hey,

I’ve been thinking about a way to version the configuration of a package since I first used the great TOML-based templates.

Today I decided to spend some time to try to figure out what we could do, and put it in a blog post: https://romain.sertelon.fr/tech/habitat-service-versioning-proposal.html

TL;DR: I propose that we include an optional pkg_svc_version in packages to version the whole service API, which is represented mainly by the default.toml file. We could use this information in many places to help ops manage configuration changes more easily, thanks to Habitat.
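As a rough sketch of what that could look like in a plan.sh: pkg_svc_version is only the proposed field and is not something Habitat reads today, and the version values are illustrative.

```bash
# plan.sh (sketch)
pkg_origin=core
pkg_name=mongodb
# Tracks the upstream software version (illustrative value).
pkg_version=3.6.4
# Proposed, hypothetical field: versions the service API, i.e. the shape of
# default.toml. Bump the major number when a config key is renamed or removed.
pkg_svc_version=2.0.0
```

The two numbers move independently, so a config cleanup can bump pkg_svc_version while pkg_version stays on the same upstream release.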


#3

That is a great post, ty @rsertelon!


#4

I think what this comes down to is a need for additional versions that track aspects of the plan, and those should be metadata of the plan, but have no impact on the build or versioning of packages.

In core plans, we want pkg_version to be the SAME version as the upstream package because it makes intuitive sense for other plans that depend on those packages. If you pin your dependencies and depend on core/openssl/2.0.16, then it’s VERY CLEAR, right there in the plan, not only which version of the package you’re depending on, but also the precise version of openssl. Of course, this bubbles up in the http gateway and any programmatic auditing you might be doing of dependencies in your infrastructure, so this is an important, simple, and clear mechanism for communicating which version of the software is running.

In order to solve this particular problem, we could include an optional metadata version called “pkg_build_version”. This version represents the state of the build and configuration in the plan at build time, so that we can communicate major changes to how the software is compiled or configured even when the underlying source software has not changed.

So if this isn’t affecting the dependency system in Habitat, what does it do?

The problem here is that a human needs to know what’s going on, and not a robot. So we should give the human an update with an email, RSS feed, or notification.

The idea is for Builder to have a new feature where users can subscribe to updates for changes to plans. This feed can monitor not only the pkg_version but the pkg_*_version as well, so that users can subscribe to updates.

The unique opportunity here is to leverage the dependency management system so that you could opt to receive updates for transitive dependencies – that way you get a clear understanding about what might have happened when the sands shift out from under you.

For example, for OpenSSL, you would have a build version like so:

pkg_version=2.0.16 (matches the upstream source version)
pkg_build_version=15.0.1

This gives you flexibility because you may want to compile OpenSSL in a particular way, and make breaking changes in the way you compile OpenSSL, even though you did not change the upstream source code in use.
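Here is a minimal sketch of how a notification feature might decide that a plan change is breaking. It assumes pkg_build_version follows semantic versioning. Both that field and the helper name is_breaking_build_change are hypothetical; nothing like this exists in Builder today.

```bash
#!/usr/bin/env bash
# Hypothetical helper: pkg_build_version is the metadata field proposed above,
# not an existing Habitat plan variable.

# Succeeds (exit 0) when the major component of the build version increased.
# Arguments: <old build version> <new build version>, e.g. 15.0.1 16.0.0
is_breaking_build_change() {
  local old_major="${1%%.*}"   # strip everything after the first dot
  local new_major="${2%%.*}"
  [ "$new_major" -gt "$old_major" ]
}

# Same upstream pkg_version, but the way the package is built has changed.
if is_breaking_build_change "15.0.1" "16.0.0"; then
  echo "breaking plan change: notify subscribers"
fi
```

Because the comparison only looks at the build version, the upstream pkg_version can stay at 2.0.16 across both builds.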


#5

To quote @nellshamrell: “Strong opinions, loosely held.”
I’ve been writing this across a few hours and multiple interruptions, so apologies in advance if I’m incoherent.

My first reaction to this question is “We need a way to version Plans”. (Note: I use the capital Plan here to denote the set of inputs that go into making an artifact.) This would allow us to communicate breaking changes in how we build the software, configuration values, or how the service runs.

However, as I’m noodling on it I’m not entirely sure it’s the right answer, or at least not a complete answer. Part of my reasoning is that in the case of user software (i.e. the user owns both the software and the plan), the versions of the two would be the same, so changes to the Plan would be communicated through major/minor version bumps of the software it packages and a Plan version would be redundant. (I could absolutely be wrong here.) That leaves “core-plans-like” software, where the plan author doesn’t control the version.

With core-plans-like software, a Plan version will communicate to users that a change may break them, but it still leaves them with a choice: upgrade and eat the pain, or stop updating. Stopping is, to me, an anti-pattern with hab. There are reasons and cases for stopping/pinning, but by and large I believe this is true.

One question I have is should people be running core-plans directly? I wibble back and forth on this. By maintaining their own plan that is just a thin layer on top of a core-plan (or running an on-prem depot), a user can manage channel promotion themselves and control what gets deployed. This does start to lean away from always consuming updates and lead us back to batching changes. It also only solves for the service aspect and doesn’t handle the case of build/runtime deps changing, build options changing, etc.

My next thought is rather than versioning plans, we version Core-plans as a whole and limit breaking changes to a regular cadence. We do this to some extent with base plans refreshes already. If we were to limit breaking changes to the refreshes, it would allow our users to plan accordingly. This also isn’t a complete solution in my mind, but I think it gets us closer than plan versions. It does start to lead us away from the rolling release model, but there will always be a set of Plans that need to move together and have regular release cadences (glibc, etc) that could be used as guideposts.

I think a lot of the breaking changes issues (for services) could be mitigated by having upgrade testing as part of the pre-merge. Lifecycle hooks are also there to help us move from one version to the next, though maintaining those could become cumbersome. I also suspect that as Plans mature, breaking changes will tend to occur only when we’re updating the underlying software due to its configuration/dependencies changing.

I’ve considered that we may be able to use channels to signify updates, but I always come back to the same outcomes: either we end up batching updates and still break the user, or they stop updating. Both are bad in my opinion.

What I think I’m landing on is a regular cadence of windows to introduce potentially breaking changes. That seems like a good first step as we iterate toward a better solution. We’ve seen a lot of activity around testing recently which is AMAZING, but I think there are some standard tests that we could provide (like service upgrades). I still don’t know that that would get us where we need to be and it’s possible there are more primitives/metadata needed, but my inclination is to work with what we have first and add features as we find them absolutely necessary.


#6

I ran into this same problem when I was working to make the stock nginx plan more useful. The redirector config had been merged in to implement one specific use case, and maintaining its existing support and behavior within the primitive capabilities of handlebars was making my config template way nastier than it needed to be. I would have loved to just propose deprecating that config instead (since IMHO it has no business being a first-class config).

The ability to write a lifecycle hook that can transform config before it is applied, which I’ve seen discussed in a few contexts, could potentially be a good way to help here. A config transform hook could help make up for the limitations of handlebars, for example by transforming deprecated config automatically if possible, or throwing a detailed error about it.
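To make the idea concrete, here is a rough sketch of what such a transform step might do. Habitat has no config transform hook today, so the function name transform_config and the key names (old_port, listen_port, a [redirector] table) are all invented for illustration and do not reflect the real nginx plan.

```bash
#!/usr/bin/env bash
# Hypothetical config transform step; not an existing Habitat hook.
# Reads user TOML on stdin and writes the transformed TOML on stdout.
transform_config() {
  local input
  input="$(cat)"

  # A removed feature cannot be migrated automatically: fail with a detailed error.
  if grep -q '^\[redirector\]' <<<"$input"; then
    echo "error: the [redirector] table is no longer supported; remove it from your config" >&2
    return 1
  fi

  # A renamed key can be migrated automatically: old_port becomes listen_port.
  sed -E 's/^old_port([[:space:]]*=)/listen_port\1/' <<<"$input"
}

# Example: old-style config goes in, new-style config comes out.
printf 'old_port = 80\n' | transform_config
```

This keeps the handlebars template itself simple, since it only ever sees the current key names.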