Understanding cluster-zone check with 2 Root-(Parent)-Nodes

Hi,

I’m using the cluster-zone command in 2 ways:

  1. On the satellites, it checks whether the satellite is connected to the parent zone (consisting of 2 nodes) → this works exactly as I expect: if only 1 parent node is unavailable, the check still says “OK”.
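
For readers following along, a satellite-side check like the one described above might look roughly like this in plain Icinga 2 DSL. This is only a sketch: the host name "satellite1.example.org", the service name, and the parent zone name "master" are placeholders, not taken from the actual setup, and in Director the same thing would be modelled through a service template instead.

```
// Sketch only: all object names are hypothetical placeholders.
object Service "parent-zone-connection" {
  host_name = "satellite1.example.org"
  check_command = "cluster-zone"

  // Name of the zone whose connection state is checked (here: the parent zone).
  vars.cluster_zone = "master"
}
```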

Because I wouldn’t recognize a disconnected zone on the parent node, I also check in the opposite direction:

  2. On the parent zone: for each child zone, I check whether the child zone is connected. The check runs on one of the parent nodes, as distributed by the Icinga 2 scheduler.
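
The parent-side direction could be sketched as an apply rule placed in the parent zone's configuration. It assumes that each child zone is represented by a host object with the same name as the zone, and it uses a made-up custom variable `is_satellite` to select those hosts; neither assumption comes from the setup described here.

```
// Sketch only: evaluated in the parent zone, one service per child zone.
apply Service "child-zone-connection" {
  check_command = "cluster-zone"

  // Assumption: the host object name equals the child zone name.
  vars.cluster_zone = host.name

  // Hypothetical custom variable marking hosts that represent a child zone.
  assign where host.vars.is_satellite
}
```

Because the parent zone has two endpoints, a service like this is executed on whichever parent node the scheduler assigns it to.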

Problem: The cluster-zone check says “CRITICAL” when it runs on the one node the satellite is currently not connected to. How can I influence this behaviour? I would expect a non-OK-state only if the zone isn’t connected at all.

I see a check-command property “ha_mode 0” within Director, but I don’t know how to set it (and don’t know whether this is the right value).

Any ideas, hints, insights?

Thanks!

Hi Dominik,

thank you, I know how to handle variables. The problem is that “ha_mode” is, according to the documentation, not a parameter of the cluster-zone check.

https://icinga.com/docs/icinga-2/latest/doc/10-icinga-template-library/#cluster-zone

Looks like a bool, so 0 or 1, or true or false. Maybe experiment with it on the CLI and then model it in the Director to apply it to the hosts or services.

Defining and setting a var “ha_mode” didn’t have any effect (I suppose because the ha_mode in question is not a vars.ha_mode).

Thank you nevertheless

I could be missing something here, but when an endpoint has a parent zone with 2 endpoints, that zone definition needs to have 2 parent endpoints, and hence the child connects to both parents.
This is one reason we built the clustergraph module, to find issues like this, where a child has been misconfigured to only have 1 parent when there are two endpoints in the parent zone.
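
As a rough illustration of what that means, a satellite's zones.conf with both parent endpoints listed could look like the sketch below. All endpoint names, zone names, and addresses are placeholders, and it assumes the satellite is the side that opens the connections (hence the `host` attribute on the parent endpoints).

```
// zones.conf on the satellite (sketch; names and addresses are placeholders)
object Endpoint "master1.example.org" {
  host = "192.0.2.11"   // the satellite actively connects to this parent
}

object Endpoint "master2.example.org" {
  host = "192.0.2.12"   // and to the second parent as well
}

object Zone "master" {
  endpoints = [ "master1.example.org", "master2.example.org" ]
}

object Endpoint "satellite1.example.org" {
}

object Zone "satellite" {
  endpoints = [ "satellite1.example.org" ]
  parent = "master"
}
```

If only one of the two parent endpoints appears in the child's view of the parent zone, the child has no reason to connect to the other one, which would match the symptom described in the question.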