Disabling checker on one of two endpoints in a zone causes overdue checks

Recently I ran into an issue and I wanted to check that I understand the check balancing feature of Icinga2 correctly.

The setup was two systems, both in the master (the only) zone. Each had its own data store (using icingadb) and each had “checker” enabled to balance check load.

That setup was running fine.

An issue came up where we needed to move the checks exclusively to one of the two machines (the primary “config master” system). I tried just disabling the checker feature on the secondary system, but the hosts and services that had been checked on that system started showing up as “overdue”.

I expected that the system that still had “checker” enabled would just take over, while both systems would keep receiving state data and logging it to their databases. (We wanted to keep the data redundancy, even though the check redundancy would be stopped.)

What seems to be happening is that Icinga still wants to balance checks to the other system in the same zone, but the checker is off there, so they never run. If I shut Icinga down completely on the secondary system, the checks do all run on the primary, but leaving the setup like that will result in data loss on the secondary.

Version used (icinga2 --version): 2.14

Hello Marc!

In addition to this you can try replacing the current HA with a parent-child hierarchy. Simply put all checkables in the child zone.
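
A rough sketch of what such a hierarchy could look like, in case it helps. The endpoint names, the zone name "checks", the host object and its address are all made up for illustration, and which of your two machines ends up in which zone is an assumption on my part. Two things to keep in mind: the config master has to stay in the parent zone, and checkables placed in the child zone are executed by the child zone's endpoint, with the results being replicated up to the parent.

```
// zones.conf, present on both nodes (all names below are hypothetical)

object Endpoint "node-a.example.com" {
}

object Endpoint "node-b.example.com" {
  host = "node-b.example.com"  // lets the parent open the connection to the child
}

// Parent zone: contains the config master.
object Zone "master" {
  endpoints = [ "node-a.example.com" ]
}

// Child zone: its single endpoint runs the checks for everything in this zone.
object Zone "checks" {
  endpoints = [ "node-b.example.com" ]
  parent = "master"
}

// On the config master: /etc/icinga2/zones.d/checks/hosts.conf
// Objects under zones.d/<zone name>/ belong to that zone.
object Host "example-host" {
  check_command = "hostalive"
  address = "192.0.2.10"
}
```

For the config sync to reach the child, its ApiListener needs accept_config = true. With only one endpoint per zone there is nothing left to balance, so the checker feature stays enabled on the child only.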