Icinga Master 1 not accepting configuration updates from (new) Master 2

krappi01 · July 28, 2022, 3:18pm

Hello team,

we followed the instructions from https://icinga.com/blog/2020/10/01/how-to-set-up-high-availability-masters/ to add a second master to our environment. But it seems that the old master (monit1) doesn’t accept config updates from the new master (monit2). Neither the new master host with its services nor a new test host I created on the new master appears in icingaweb.
In the logs from master 1 I see when the new master restarts:

[2022-07-28 17:02:22 +0200] information/ApiListener: New client connection for identity 'monit2' from [::ffff:192.168.9.78]:50364
[2022-07-28 17:02:22 +0200] information/JsonRpcConnection: Requesting new certificate for this Icinga instance from endpoint 'monit2'.
[2022-07-28 17:02:22 +0200] information/ApiListener: Sending config updates for endpoint 'monit2' in zone 'monit1'.
[2022-07-28 17:02:22 +0200] information/ApiListener: Syncing configuration files for zone 'monit1' to endpoint 'monit2'.
[2022-07-28 17:02:22 +0200] information/ApiListener: Syncing configuration files for global zone 'director-global' to endpoint 'monit2'.
[2022-07-28 17:02:22 +0200] information/ApiListener: Finished sending config file updates for endpoint 'monit2' in zone 'monit1'.
[2022-07-28 17:02:22 +0200] information/ApiListener: Syncing runtime objects to endpoint 'monit2'.
[2022-07-28 17:02:22 +0200] information/ApiListener: Finished syncing runtime objects to endpoint 'monit2'.
[2022-07-28 17:02:22 +0200] information/ApiListener: Finished sending runtime config updates for endpoint 'monit2' in zone 'monit1'.
[2022-07-28 17:02:22 +0200] information/ApiListener: Sending replay log for endpoint 'monit2' in zone 'monit1'.
[2022-07-28 17:02:22 +0200] information/ApiListener: Replayed 64 messages.
[2022-07-28 17:02:22 +0200] information/ApiListener: Finished sending replay log for endpoint 'monit2' in zone 'monit1'.
[2022-07-28 17:02:22 +0200] information/ApiListener: Finished syncing endpoint 'monit2' in zone 'monit1'.
[2022-07-28 17:02:22 +0200] information/JsonRpcConnection: Received certificate request for CN 'monit2' signed by our CA.
[2022-07-28 17:02:22 +0200] information/JsonRpcConnection: The certificate for CN 'monit2' is valid and uptodate. Skipping automated renewal.
[2022-07-28 17:02:22 +0200] information/ApiListener: Applying config update from endpoint 'monit2' of zone 'monit1'.
[2022-07-28 17:02:22 +0200] information/ApiListener: Ignoring config update from endpoint 'monit2' for zone 'director-global' because we have an authoritative version of the zone's config.
[2022-07-28 17:02:22 +0200] information/ApiListener: Ignoring config update from endpoint 'monit2' for zone 'monit1' because we have an authoritative version of the zone's config.
[2022-07-28 17:02:22 +0200] information/ApiListener: Received configuration updates (0) from endpoint 'monit2' are equal to production, skipping validation and reload.

What I tried on both masters, without success:
(It was accept_config = false before on the old master)

/etc/icinga2/features-available/api.conf
object ApiListener "api" {
  accept_config = true
  accept_commands = false

  ticket_salt = TicketSalt
}

setting the TicketSalt in constants.conf (same value as on the old master - I don’t know if that’s the right way?)
changing the object Zone in constants.conf so that it matches with the old master

My zones configs look like this:

new master:

object Endpoint "monit1" {
        host = "192.168.9.77"
        port = "5665"
}

object Zone "monit1" {
        endpoints = [ "monit1", "monit2" ]
}

object Endpoint "monit2" {
}

object Zone "global-templates" {
        global = true
}

object Zone "director-global" {
        global = true
}

old master:

object Endpoint "monit1" {
}

object Endpoint "monit2" {
    host = "monit2"
}

object Zone "monit1" {
  endpoints = [ "monit1", "monit2" ]
}

object Zone "global-templates" {
  global = true
}

object Zone "director-global" {
  global = true
}

Additionally, since we have icinga2 running on the new master, we also have some checks, especially for the old master, that are overdue. When clicking on “check now” for these overdue checks we get: Can’t send external Icinga command: 404 No objects found. We didn’t have overdue checks when there was only one master. When stopping icinga2 on the new master “check now” works again and there are no overdue checks anymore. Maybe this problem is related.

What did I do wrong?

I just want to make sure the new master also appears in icingaweb with all its (default) services, and that later, when I deploy with Icinga Director on master2, the changes also arrive on master1.

Really appreciate your help
Markus

log1c · July 29, 2022, 7:24am

It sounds like you set up the 2nd master as a master node as well and have now ended up with two master servers that both think they are the one that holds the configuration.
If you followed the howto from the link and added the 2nd master as a satellite first, that shouldn’t happen.

As master1 was the one you set up first, it is the configuration master, so it will refuse any incoming config from other nodes.
That means the Director has to have master1 as its configuration master in the endpoint section.

It also means master1 is the one the Director pulls its external config from via the kickstart wizard.
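
For config that is kept in files rather than in the Director, the same rule applies: objects meant for both masters go under zones.d on master1, which then syncs them to master2. A rough sketch, assuming the new master should be monitored with a plain `hostalive` check; the file name is hypothetical, and the address is the one monit2 connects from in the log above:

```
// On monit1 (the config master) only, for example in
// /etc/icinga2/zones.d/monit1/monit2.conf (hypothetical file name)
object Host "monit2" {
  check_command = "hostalive"   // assumed check; replace with whatever you use
  address = "192.168.9.78"      // source address of monit2 in the posted log
}
```

After a config validation and reload on monit1, the object should arrive on monit2 through the zone sync instead of the other way round.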

krappi01 · July 29, 2022, 10:59am

Yes I did configure it as a satellite first.
Maybe syncing the local services (I mean the checks for the new master itself) from Master 2 to Master 1 was the wrong approach, and I should add the checks for the new master with Icinga Director instead. The Director config is as you describe, and deploying new configs works.

I could solve the HTTP 404 problem by switching the command transports in the icingadb module. (First one Master-1, second one Master-2)
The overdue ping checks seemed to be checks that were already acknowledged but the second master didn’t know that. I got rid of them by removing the ack and re-acknowledging them.

krappi01 · July 29, 2022, 3:22pm

Now I found the issue in icinga2.conf on master2:

// Disabled by the node setup CLI command on 2022-07-27 16:15:03 +0200
//include_recursive "conf.d"

That also explains why my locally defined notification commands weren’t available on the second master.
And the overdue checks seem to come from conflicts between the local configuration and the Director configuration.
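
For anyone who hits the same thing: instead of re-enabling conf.d on master2, one option is probably to move the local definitions into zones.d on master1 so that both masters receive them through the config sync. A minimal sketch, assuming the command belongs in the global-templates zone; the object name and file name are made up, and the script path is the one used in the stock conf.d/commands.conf:

```
// On monit1, for example in
// /etc/icinga2/zones.d/global-templates/commands.conf (hypothetical file name)
object NotificationCommand "my-mail-host-notification" {
  // ConfigDir normally resolves to /etc/icinga2
  command = [ ConfigDir + "/scripts/mail-host-notification.sh" ]
}
```

The script itself is not part of the config sync, so it would have to exist at the same path on both masters.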