Validate services when "--strategy rolling"?


#1

Hello,

Should Habitat validate that a service is back up before moving on to the next node when the update strategy is “rolling”?

We just ran into an issue where an update to a package was detected and downloaded, and Habitat dutifully restarted all of our nodes one at a time. However, due to the change, the service was no longer able to start and was left in a “flapping” state.

Meanwhile, hab svc status still reports the service as “up”.

Perhaps Habitat normally checks whether a service is up before moving on to the next node, but only if the plan has a service check hook? Maybe if our core/consul plan had a service check hook, it would have said “well, node 1 didn’t come back, let’s not update the other nodes”?
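
To make the idea concrete, here is a rough sketch of the behavior being asked about: update one node, check it, and stop the rollout if the check fails. This is plain Python for illustration only, not how Habitat is implemented; `rolling_update`, `update`, and `is_healthy` are hypothetical names, and the node list is made up.

```python
from typing import Callable, Iterable, List


def rolling_update(
    nodes: Iterable[str],
    update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Update nodes one at a time; stop at the first node that fails its check.

    Returns the nodes that were updated and then verified healthy.
    (Hypothetical helper, not part of Habitat.)
    """
    verified: List[str] = []
    for node in nodes:
        update(node)
        if not is_healthy(node):
            # Halt here so the remaining nodes keep running the old release.
            break
        verified.append(node)
    return verified


# Hypothetical scenario: the new release cannot start on any node.
touched: List[str] = []
verified = rolling_update(
    ["node1", "node2", "node3"],
    update=touched.append,          # record which nodes were restarted
    is_healthy=lambda node: False,  # every check fails after the update
)
```

In this scenario only the first node is restarted; the other two are never touched, which is the outcome we would have wanted for our consul cluster.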

Fortunately, this isn’t a production environment, so there was no service impact, but I can see how this could have been a major issue for someone running core/consul in production.

Anyway, just thinking out loud.


#2

@christophermaier might have some thoughts on this.


#3

I think this might sum up the issue: https://github.com/habitat-sh/habitat/issues/5327