To quote @nellshamrell: “Strong opinions, loosely held.”
I’ve been writing this across a few hours and multiple interruptions, so apologies in advance if I’m incoherent.
My first reaction to this question is “We need a way to version Plans”. (Note: I use the capital Plan here to denote the set of inputs that go into making an artifact.) This would allow us to communicate breaking changes in how we build the software, configuration values, or how the service runs.
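To make the idea concrete, here is a minimal sketch of how a consumer could read a Plan version to spot a breaking change. It assumes things that don't exist today: Plans carry their own semver-style `major.minor.patch` version, and a major bump is the agreed signal for "this may break you". The names `PlanVersion`, `parse`, and `is_breaking` are made up for illustration.

```python
from typing import NamedTuple


class PlanVersion(NamedTuple):
    """Hypothetical version attached to a Plan (not to the software it packages)."""
    major: int
    minor: int
    patch: int


def parse(version: str) -> PlanVersion:
    """Parse a 'major.minor.patch' string into a PlanVersion."""
    major, minor, patch = (int(part) for part in version.split("."))
    return PlanVersion(major, minor, patch)


def is_breaking(current: str, candidate: str) -> bool:
    """Assumed convention: a higher major version signals a breaking Plan change."""
    return parse(candidate).major > parse(current).major


if __name__ == "__main__":
    print(is_breaking("1.4.0", "1.5.0"))  # minor bump of the Plan
    print(is_breaking("1.4.0", "2.0.0"))  # major bump of the Plan
```

Note that this only tells the user a break is coming; it doesn't help them get through it.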
However, as I’m noodling on it, I’m not entirely sure it’s the right answer, or at least not a complete answer. Part of my reasoning is that in the case of user software (i.e. the user owns both the software and the Plan), the versions of the two would be the same. Changes to the Plan would be communicated through major/minor version bumps of the software it packages, and a Plan version would be redundant. (I could absolutely be wrong here.) That leaves “core-plans-like” software, where the Plan author doesn’t control the version of the software.
With core-plans-like software, providing a Plan version does communicate to users that an update may break them, but it still leaves them with a choice: upgrade and eat the pain, or stop updating. Stopping is, to me, an anti-pattern with hab. There are valid reasons and cases for stopping/pinning, but by and large I believe that holds.
One question I have: should people be running core-plans directly? I wibble back and forth on this. By maintaining their own plan that is just a thin layer on top of a core-plan (or running an on-prem depot), a user can manage channel promotion themselves and control what gets deployed. This does start to lean away from always consuming updates and leads us back to batching changes. It also only solves for the service aspect and doesn’t handle the case of build/runtime deps changing, build options changing, etc.
My next thought is that, rather than versioning individual Plans, we version core-plans as a whole and limit breaking changes to a regular cadence. We do this to some extent with base plan refreshes already. If we were to limit breaking changes to the refreshes, it would allow our users to plan accordingly. This also isn’t a complete solution in my mind, but I think it gets us closer than Plan versions. It does start to lead us away from the rolling release model, but there will always be a set of Plans that need to move together and have regular release cadences (glibc, etc.) that could be used as guideposts.
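As a rough sketch of what "limit breaking changes to the refreshes" could look like as a pre-merge rule: non-breaking changes merge any time, breaking ones only inside a published window. Everything here is hypothetical, including the window dates and the names `REFRESH_WINDOWS`, `in_refresh_window`, and `may_merge`. How a change gets flagged as breaking is left open (a PR label, for instance).

```python
from datetime import date

# Made-up refresh windows as inclusive (start, end) pairs.
# Real windows would come from whatever schedule gets published.
REFRESH_WINDOWS = [
    (date(2019, 1, 7), date(2019, 1, 18)),
    (date(2019, 4, 8), date(2019, 4, 19)),
]


def in_refresh_window(day: date, windows=REFRESH_WINDOWS) -> bool:
    """True if the given day falls inside any refresh window."""
    return any(start <= day <= end for start, end in windows)


def may_merge(is_breaking: bool, day: date, windows=REFRESH_WINDOWS) -> bool:
    """Non-breaking changes can always merge; breaking ones wait for a window."""
    return (not is_breaking) or in_refresh_window(day, windows)


if __name__ == "__main__":
    print(may_merge(is_breaking=False, day=date(2019, 2, 1)))
    print(may_merge(is_breaking=True, day=date(2019, 2, 1)))
    print(may_merge(is_breaking=True, day=date(2019, 1, 10)))
```

The point of the gate is predictability: users know when breaking changes can land and can plan their own upgrades around those dates.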
I think a lot of the breaking-change issues (for services) could be mitigated by having upgrade testing as part of the pre-merge checks. Lifecycle hooks are also there to help us move from one version to the next, though maintaining those could become cumbersome. I also suspect that as Plans mature, breaking changes will tend to occur only when we’re updating the underlying software and its configuration or dependencies change.
I’ve considered that we may be able to use channels to signify updates, but I always come back to the same problem: either we end up batching updates and still break the user, or they stop updating. Both are bad in my opinion.
What I think I’m landing on is a regular cadence of windows to introduce potentially breaking changes. That seems like a good first step as we iterate toward a better solution. We’ve seen a lot of activity around testing recently, which is AMAZING, and I think there are some standard tests that we could provide (like service upgrades). I still don’t know that this would get us where we need to be, and it’s possible there are more primitives/metadata needed, but my inclination is to work with what we have first and add features as we find them absolutely necessary.