Prometheus service discovery from Habitat ring



I’m using Habitat to run some Prometheus Exporters, which need to register themselves with the Prometheus server (also in Habitat).

Originally I was just using a --bind for each exporter service to the Prometheus service. The bound services were then rendered into the Prometheus config as a group of static hosts (like in core-plans). This worked well, with one big issue: in newer sup versions, #each only yields live services. That means a host that shuts down cleanly is removed from my config. This is bad, as we don’t want machines flapping in and out of the monitoring config. We want them removed only when a machine is forcibly removed from the ring.

I’ve written a small Rust binary which runs as a sidecar and queries the ring on a schedule, updating a file which can be referenced from a file_sd_configs: setting in Prometheus (which hot-reloads the file as it changes). You give it a list of services you are interested in; each one needs to expose a metric-http-port config element for the exporter port.
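To give a rough idea of the output side, here is a minimal std-only sketch of rendering targets into the JSON shape that file_sd_configs reads (a list of objects with "targets" and "labels"). This is not the actual code from my binary: the function names, the "service" label and the write-then-rename step are all assumptions made for illustration.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Escape a string for use inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Render (service name, targets) pairs as a file_sd JSON document.
/// The "service" label name is an assumption, not something Prometheus requires.
fn render_file_sd(groups: &[(&str, Vec<String>)]) -> String {
    let entries: Vec<String> = groups
        .iter()
        .map(|(service, targets)| {
            let quoted: Vec<String> = targets
                .iter()
                .map(|t| format!("\"{}\"", json_escape(t)))
                .collect();
            format!(
                "{{\"targets\":[{}],\"labels\":{{\"service\":\"{}\"}}}}",
                quoted.join(","),
                json_escape(service)
            )
        })
        .collect();
    format!("[{}]", entries.join(","))
}

/// Write to a sibling ".tmp" file first, then rename it over the real path,
/// so a reader should not see a half-written file.
fn write_atomically(path: &Path, contents: &str) -> io::Result<()> {
    let tmp = path.with_extension("tmp");
    fs::write(&tmp, contents)?;
    fs::rename(&tmp, path)
}
```

The rename is there because Prometheus watches the file for changes. Writing to a name that does not end in .json, .yml or .yaml and then renaming should keep it from picking up a partial write.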

It explicitly includes all supervisors which are linked to the service (in /census) and explicitly excludes supervisors which have been departed with hab sup depart.
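The include/exclude rule can be sketched roughly as below. The Member struct is a made-up, simplified stand-in for what comes back from /census (the real payload looks different), and select_targets is a hypothetical name. The point is only that liveness is deliberately ignored while the departed flag is what removes a host.

```rust
use std::collections::BTreeSet;

/// Simplified, hypothetical view of one supervisor in a service group.
/// This is not the real /census schema.
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct Member {
    address: String,
    /// Value of the metric-http-port config element, if the service exposes one.
    metric_http_port: Option<u16>,
    /// Whether the supervisor is currently alive. Intentionally not used below.
    alive: bool,
    /// Whether the supervisor has been departed with `hab sup depart`.
    departed: bool,
}

/// Build a sorted, de-duplicated list of "host:port" targets.
/// Members stay in the list when they are merely down; only departed
/// members (or ones with no exporter port) are left out.
fn select_targets(members: &[Member]) -> Vec<String> {
    let mut targets = BTreeSet::new();
    for m in members {
        if m.departed {
            continue;
        }
        if let Some(port) = m.metric_http_port {
            targets.insert(format!("{}:{}", m.address, port));
        }
    }
    targets.into_iter().collect()
}
```

Sorting the targets keeps the generated file stable between runs, so it only changes on disk when membership actually changes.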

I’m cleaning up the code at the moment and still have to write a README etc., but I wanted to gauge interest and see if there is some way of doing this in Habitat which I’ve missed :stuck_out_tongue: