Adding a second master node to an existing setup for HA via Director

Hi Icinga-Community,

we are currently trying to add a second master node for high availability to our running production Icinga setup. Unfortunately, we have encountered some difficulties with the Icinga configuration on the new, second master node.

We started with a fresh machine, installed Icinga and ran the Node Wizard (setting up the second master node as a satellite). For the following steps we used the official documentation: Distributed Monitoring - Icinga 2
Each of our master nodes has its own database backend.
By modifying zones.conf under “/etc/icinga2/” we were able to get both nodes running. A look at the network traffic (via netstat) showed that the satellites were already communicating with the newly added master node. We have also successfully added the new node as a host via Director in our icingaweb2 frontend. However, no new check results could be found in the database of the second master node.
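
For reference, a master zone with two endpoints in zones.conf typically looks roughly like the sketch below. This is only an illustration under assumptions: `firstmaster.xyz.net` is a placeholder name for the existing config master, and the `host` attribute is a guess at how the connection is established. Neither is taken from the actual setup.

```
// Sketch of /etc/icinga2/zones.conf for a two-master zone (names partly hypothetical)

object Endpoint "firstmaster.xyz.net" {
  // placeholder name for the existing config master
}

object Endpoint "secondmaster.xyz.net" {
  // assumption: the first master actively connects to the second one
  host = "secondmaster.xyz.net"
}

object Zone "master" {
  // both endpoints in one zone form the HA pair
  endpoints = [ "firstmaster.xyz.net", "secondmaster.xyz.net" ]
}
```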

So we looked further in the documentation and found the following:

And from this point on it gets more and more messy :confused:
We did this initial sync, but afterwards we could not restart the second master node successfully because of a re-definition error.

E.g.:

critical/config: Error: Object ‘secondmaster.xyz.net’ of type ‘Endpoint’ re-defined: in /var/lib/icinga2/api/packages/director/5484465a-5088-4025-971b-72eeb3ca059c/zones.d/master/agent_endpoints.conf: 1:0-1:39; previous definition: in /etc/icinga2/zones.conf
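
The conflict reported here can be pictured as the same Endpoint object being declared in two places. The sketch below is an assumption about what the two declarations look like, not a copy of the real files; only the object name and the file paths come from the error message.

```
// In /etc/icinga2/zones.conf (local, static definition)
object Endpoint "secondmaster.xyz.net" {
}

// In the Director package, zones.d/master/agent_endpoints.conf
// (presumably generated because the node was added as a host via Director)
object Endpoint "secondmaster.xyz.net" {
}
```

Icinga 2 refuses to load two objects of the same type with the same name, which matches the “re-defined” wording in the message.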

If we comment out the corresponding object in /etc/icinga2/zones.conf, we get error messages about missing objects.

E.g.:

[2021-05-12 15:19:09 +0200] information/cli: Icinga application loader (version: r2.12.3-1)
[2021-05-12 15:19:09 +0200] information/cli: Loading configuration file(s).
[2021-05-12 15:19:09 +0200] information/ConfigItem: Committing config item(s).
[2021-05-12 15:19:09 +0200] information/ApiListener: My API identity: secondmaster.xyz.net
[2021-05-12 15:19:09 +0200] critical/config: Error: Validation failed for object ‘node-446.xyz.net!win_disk!Teams_service’ of type ‘Notification’; Attribute ‘command’: Object ‘mail-service-notification’ of type ‘NotificationCommand’ does not exist.
Location: in /var/lib/icinga2/api/packages/director/5484465a-5088-4025-971b-72eeb3ca059c/zones.d/master/notification_templates.conf: 2:5-2:41
/var/lib/icinga2/api/packages/director/5484465a-5088-4025-971b-72eeb3ca059c/zones.d/master/notification_templates.conf(1): template Notification “Teams_serviceNotification” {
/var/lib/icinga2/api/packages/director/5484465a-5088-4025-971b-72eeb3ca059c/zones.d/master/notification_templates.conf(2): command = “mail-service-notification”
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/var/lib/icinga2/api/packages/director/5484465a-5088-4025-971b-72eeb3ca059c/zones.d/master/notification_templates.conf(3): users = [ “team” ]
/var/lib/icinga2/api/packages/director/5484465a-5088-4025-971b-72eeb3ca059c/zones.d/master/notification_templates.conf(4): }

This error log goes on endlessly and seems to list every service on every host we have.

Note: For stability reasons we configured our main config master via local conf files (only executing checks and the master zone) under /etc/icinga2/ and not with the Director. Notifications etc. were defined entirely via the Director.

Now we are completely unsure whether and how to do this initial sync. We searched the Icinga community, GitHub and the web and couldn’t find a definitive answer.

Thank you for your help :slight_smile:
