Init hook not run


#1

RHEL7, hab-sup run via systemd

systemctl stop hab-sup-default
ps ax | grep hab -> no results
rm -fr /hab/sup/default

systemctl start hab-sup-default
hab svc status -> no services loaded

hab svc load chrisortman/sparc-request

supervisor output:
Jun 28 08:18:58 rsmt-appdev01.icts.uiowa.edu hab[5219]: hab-launch(SV): Supervisor process killed by signal 15; shutting everything down now
Jun 28 08:22:47 rsmt-appdev01.icts.uiowa.edu hab[5906]: hab-sup(MR): Supervisor Member-ID e44c358d0ab249ae85d4a8fd70aaa26c
Jun 28 08:22:47 rsmt-appdev01.icts.uiowa.edu hab[5906]: hab-sup(MR): Starting gossip-listener on 0.0.0.0:9638
Jun 28 08:22:47 rsmt-appdev01.icts.uiowa.edu hab[5906]: hab-sup(MR): Starting ctl-gateway on 127.0.0.1:9632
Jun 28 08:22:47 rsmt-appdev01.icts.uiowa.edu hab[5906]: hab-sup(MR): Starting http-gateway on 0.0.0.0:9631
Jun 28 08:25:27 rsmt-appdev01.icts.uiowa.edu hab[5906]: hab-sup(AG): The chrisortman/sparc-request service was successfully loaded
Jun 28 08:25:29 rsmt-appdev01.icts.uiowa.edu hab[5906]: hab-sup(MR): Starting chrisortman/sparc-request
Jun 28 08:25:29 rsmt-appdev01.icts.uiowa.edu hab[5906]: sparc-request.default(UCW): Watching user.toml
Jun 28 08:25:29 rsmt-appdev01.icts.uiowa.edu hab[5906]: sparc-request.default(HK): Hooks compiled
Jun 28 08:25:29 rsmt-appdev01.icts.uiowa.edu hab[5906]: hab-sup(SR): Reattached to sparc-request.default

hab svc stop chrisortman/sparc-request

supervisor output:

Jun 28 08:26:39 rsmt-appdev01.icts.uiowa.edu hab[5906]: hab-sup(AG): Supervisor stopping chrisortman/sparc-request. See the Supervisor output for more details.
Jun 28 08:26:41 rsmt-appdev01.icts.uiowa.edu hab[5906]: sparc-request.default(SR): Service stop failed: hab-sup(ER)[components/sup/src/error.rs:509:9]: NoPID:

There must be some state hanging around somewhere, but I can’t figure out where that would be :confused:


#2

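One possibility is that you had a stale PID file for the service in /hab/svc/SERVICE_NAME/PID. Note that this lives under /hab/svc, so removing /hab/sup/default does not touch it. Right now those PID files are not reliably cleaned up (see this PR for progress: https://github.com/habitat-sh/habitat/pull/5236). With a stale PID file in place, Habitat couldn’t kill the service when you issued the stop command, because the process it thought it was tracking had already exited or never existed in the first place. That would also be consistent with the "Reattached to sparc-request.default" line in your log, where the Supervisor appears to pick up an existing PID rather than starting the service fresh.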
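
If you want to confirm that diagnosis before deleting anything, something like the following might help. It is only a rough sketch: the function name check_pid_file is made up for illustration, and it assumes each PID file contains nothing but the numeric process ID.

```shell
#!/bin/sh
# Report whether a PID file points at a process that is still running.
# check_pid_file is a hypothetical helper, not part of Habitat.
check_pid_file() {
    pid_file=$1
    if [ ! -f "$pid_file" ]; then
        echo "missing"
        return 0
    fi
    pid=$(cat "$pid_file")
    # Anything that is not a plain number is treated as stale.
    case $pid in
        ''|*[!0-9]*) echo "stale"; return 0 ;;
    esac
    # kill -0 sends no signal; it only tests whether the process can be
    # signalled. It also fails without permission, so run this as root.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "stale"
    fi
}

# Check every service directory under /hab/svc.
for f in /hab/svc/*/PID; do
    [ -e "$f" ] || continue
    printf '%s: %s\n' "$f" "$(check_pid_file "$f")"
done
```

Any file reported as "stale" while the Supervisor is stopped is a candidate for removal.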

We are currently working around a similar problem with the following hack in our systemd unit file:

ExecStartPre=/bin/sh -c 'rm -f /hab/svc/*/PID'
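
If you would rather not edit the unit file itself, one way to add that line is a systemd drop-in. This is a sketch under assumptions: the unit name hab-sup-default.service is guessed from the systemctl commands earlier in the thread, and the function name and the file name pid-cleanup.conf are arbitrary.

```shell
#!/bin/sh
# Write a systemd drop-in that clears Habitat PID files before the
# Supervisor starts. The target directory is a parameter so the function
# can be tried against a scratch directory first.
write_pid_cleanup_dropin() {
    unit_dir=$1
    mkdir -p "$unit_dir" || return 1
    cat > "$unit_dir/pid-cleanup.conf" <<'EOF'
[Service]
ExecStartPre=/bin/sh -c 'rm -f /hab/svc/*/PID'
EOF
}

# Assumed unit name; adjust to match your setup.
# Uncomment to install for real, as root:
# write_pid_cleanup_dropin /etc/systemd/system/hab-sup-default.service.d
# systemctl daemon-reload
```

The drop-in is only picked up after a daemon-reload, and the cleanup runs on the next start of the unit.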

Cheers,

Steven


#3

That was it! Thanks a bunch, you probably saved my morning.