I have had problems with my Icinga deployment since upgrading to 2.11.0 and Director 1.7.0.
Every time a new configuration is deployed using Director, lots of hosts (and their services) lose their connection to the Satellite/Master servers. If I wait 1 or 2 minutes, all of them reconnect and everything seems to work fine.
I have read in other topics that this may be caused by an incorrect Zones configuration and synchronization issues.
Since I started working with Icinga, everything has been managed using Director, and this is the first time I have seen problems.
I have also read about a possible fix in 2.11.1 for this specific problem (Zones).
How can I troubleshoot this issue and confirm whether it is related to Zones?
This message is shown when applying the config, so the cause seems clear:
Warning: you’re running Icinga v2.11.0 and our configuration looks like you could face issue #7530. We’re already working on a solution. The GitHub Issue and our Upgrading documentation contain related details.
Hmm, I don’t know if 2.11.1 will solve this problem. It is not clear how to re-implement the old behaviour while keeping the bug fix implemented in 2.11; that’s also mentioned in the linked GitHub issue. At this time, I rather doubt that this will make it into a bugfix release anytime soon.
Since you’re seeing the message inside the Director, I strongly recommend changing the Director configuration: remove the master/satellite endpoint and zone details from it, and move them back into the external zones.conf.
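For illustration, a minimal sketch of what such an external zones.conf could look like. All host names, addresses and zone names here are placeholders and not taken from your setup, and the global zone name assumes Director's usual default, so adjust them to whatever your deployment actually uses:

```
// zones.conf sketch: every name and address below is hypothetical

object Endpoint "master1.example.com" {
  host = "192.0.2.10"
}

object Zone "master" {
  endpoints = [ "master1.example.com" ]
}

object Endpoint "satellite1.example.com" {
  host = "192.0.2.20"
}

object Zone "satellite" {
  endpoints = [ "satellite1.example.com" ]
  parent = "master"   // the satellite zone is a child of the master zone
}

// Global zone used for Director-deployed templates and commands
// (assumed name; use the one configured in your environment)
object Zone "director-global" {
  global = true
}
```

The same endpoint and zone definitions would need to be present on each master and satellite, and the matching objects in Director would then be treated as external rather than deployed.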
zones.conf on each master/satellite server updated with endpoints and zones.
Director database modified:
UPDATE icinga_endpoint SET object_type = 'external_object' WHERE object_type = 'object';
UPDATE icinga_zone SET object_type = 'external_object' WHERE object_type = 'object';
Icinga service restarted on all master/satellite servers.
Everything seems to be working as expected, but I am still losing agent connections while deploying a new config.
2 minutes later all agents are connected again.