No reload/restart of Icinga2 possible

Hello,

We have the problem that whenever we change a config file in /etc/icinga2 and try to reload the service, both master servers report the following error:

systemctl status icinga2.service
● icinga2.service - Icinga host/service/network monitoring system
     Loaded: loaded (/lib/systemd/system/icinga2.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/icinga2.service.d
             └─limits.conf
     Active: active (running) since Fri 2023-06-16 07:29:21 CEST; 8h ago
    Process: 3441206 ExecStartPre=/usr/lib/icinga2/prepare-dirs /etc/default/icinga2 (code=exited, status=0/SUCCESS)
    Process: 3509812 ExecReload=/usr/lib/icinga2/safe-reload /etc/default/icinga2 (code=exited, status=1/FAILURE)
   Main PID: 3441211 (icinga2)
     Status: "Config validation failed."
      Tasks: 63
     Memory: 253.6M
        CPU: 51min 27.295s
     CGroup: /system.slice/icinga2.service
             ├─3441211 /usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2 --no-stack-rlimit daemon --close-stdio -e /var/log/icinga2/error.log
             ├─3441286 /usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2 --no-stack-rlimit daemon --close-stdio -e /var/log/icinga2/error.log
             └─3441303 /usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2 --no-stack-rlimit daemon --close-stdio -e /var/log/icinga2/error.log

Jun 16 13:56:01 pr-be-ic-mas-02 safe-reload[3509888]: [2023-06-16 13:56:01 +0200] warning/config: Ignoring directory '/etc/icinga2/zones.d/ic-sat-aws01' for unknown zone 'ic-sat-aws01'.
Jun 16 13:56:01 pr-be-ic-mas-02 safe-reload[3509888]: [2023-06-16 13:56:01 +0200] critical/config: Error: Object 'check_ca' of type 'CheckCommand' re-defined: in /var/lib/icinga2/api/zones/global-config/_etc/commands/ca_check.conf: 1:0-1:29; previous definition: in /etc/icinga2/zones.d/global-templates/commands/ca_check.conf: 1:0-1:29
Jun 16 13:56:01 pr-be-ic-mas-02 safe-reload[3509888]: Location: in /var/lib/icinga2/api/zones/global-config/_etc/commands/ca_check.conf: 1:0-1:29
Jun 16 13:56:01 pr-be-ic-mas-02 safe-reload[3509888]: /var/lib/icinga2/api/zones/global-config/_etc/commands/ca_check.conf(1): object CheckCommand "check_ca" {
Jun 16 13:56:01 pr-be-ic-mas-02 safe-reload[3509888]:                                                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Jun 16 13:56:01 pr-be-ic-mas-02 safe-reload[3509888]: /var/lib/icinga2/api/zones/global-config/_etc/commands/ca_check.conf(2):     import "plugin-check-command"
Jun 16 13:56:01 pr-be-ic-mas-02 safe-reload[3509888]: /var/lib/icinga2/api/zones/global-config/_etc/commands/ca_check.conf(3):
Jun 16 13:56:01 pr-be-ic-mas-02 safe-reload[3509888]: [2023-06-16 13:56:01 +0200] critical/cli: Config validation failed. Re-run with 'icinga2 daemon -C' after fixing the config.
Jun 16 13:56:01 pr-be-ic-mas-02 systemd[1]: icinga2.service: Control process exited, code=exited, status=1/FAILURE
Jun 16 13:56:01 pr-be-ic-mas-02 systemd[1]: Reload failed for Icinga host/service/network monitoring system.
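
The critical/config line above names two files that both declare the same object. As a minimal sketch (the function name find_object_definitions is hypothetical and not part of Icinga 2), a recursive grep over both trees should list every file that declares a given object name:

```shell
#!/bin/sh
# Hypothetical helper: list every file that defines an object with the given name.
# Usage: find_object_definitions NAME DIR [DIR...]
find_object_definitions() {
    name=$1
    shift
    # -r recurse, -l print only the names of matching files, -E extended regex.
    # Matches lines such as: object CheckCommand "check_ca" {
    grep -rlE "^[[:space:]]*object[[:space:]]+[A-Za-z]+[[:space:]]+\"${name}\"" "$@"
}

# Example call with the two trees named in the error message:
# find_object_definitions check_ca /etc/icinga2 /var/lib/icinga2/api/zones
```

If the call prints more than one path, the object is defined more than once, which is what the validation error complains about.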

Does anyone know how this could be fixed?
We installed the second master only a week ago and activated the HA config sync for the API:

/**
 * The API listener is used for distributed monitoring setups.
 */
object ApiListener "api" {
  accept_config = true
  accept_commands = true

  ticket_salt = TicketSalt
}
object ApiUser "client-pki-ticket" {
  password = "xxxxxxxxxxxxxxxx"
  permissions = [ "actions/generate-ticket" ]
}
object ApiUser "director" {
  password = "xxxxxxxxxxxxxxxx"
  permissions = [ "*" ]
}
object ApiUser "dashing" {
  password = "xxxxxxxxxxxxxxxx"
  permissions = [ "status/query", "objects/query/*" ]
}

Here is the current zones.conf:

/*
 * Generated by Icinga 2 node setup commands
 * on 2018-01-10 17:39:07 +0100
 */
object Endpoint "icinga02.picturemaxx.net" {
  host = "xxxxxxxxxxxxxx"
}

object Endpoint "pr-be-ic-mas-02" {
  host = "xxxxxxxxxxx"
}

#object Endpoint "ic-sat-qsc01" {
#  host = "xxxxxxxxxxxx"
#  /* host = "xxxxxxxxxxx" */
#}

object Endpoint "ic-sat-qsc01-VM" {
  host = "xxxxxxxxxxxx"
}

#object Endpoint "icinga02-sat-off.picturemaxx.net" {
#  host = "xxxxxxxxxxxx"
#}

#object Endpoint "icinga02-sat-aws.picturemaxx.net" {
#  host = "xxxxxxxxxxxx"
#}

object Endpoint "ic-sat-aws-apne1-01" {
  host = "xxxxxxxxxxxx"
}

object Endpoint "ic-sat-aws-use1-01" {
  host = "xxxxxxxxxxx"
}

object Endpoint "ic-sat-aws-euw1-01" {
  host = "xxxxxxxxxxx"
}

object Endpoint "ic-sat-aws-euc1-01" {
  host = "xxxxxxxxxxx"
}

object Endpoint "pr-be-ic-sat-off-01" {
  host = "xxxxxxxxxxx"
}

object Zone "ic-mas-qsc01" {
  endpoints = [ "icinga02.picturemaxx.net", "pr-be-ic-mas-02" ]
}

object Zone "ic-sat-qsc01" {
  endpoints = [ "ic-sat-qsc01-VM" ]
  parent = "ic-mas-qsc01"
}

object Zone "ic-sat-off01" {
  endpoints = [ "pr-be-ic-sat-off-01" ]
  parent = "ic-mas-qsc01"
}

#object Zone "ic-sat-aws01" {
#  endpoints = [ "icinga02-sat-aws.picturemaxx.net" ]
#  parent = "ic-mas-qsc01"
#}

object Zone "ic-sat-aws-apne1-01" {
  endpoints = [ "ic-sat-aws-apne1-01" ]
  parent = "ic-mas-qsc01"
}

object Zone "ic-sat-aws-euw1-01" {
  endpoints = [ "ic-sat-aws-euw1-01" ]
  parent = "ic-mas-qsc01"
}

object Zone "ic-sat-aws-euc1-01" {
  endpoints = [ "ic-sat-aws-euc1-01" ]
  parent = "ic-mas-qsc01"
}

object Zone "ic-sat-aws-use1-01" {
  endpoints = [ "ic-sat-aws-use1-01" ]
  parent = "ic-mas-qsc01"
}

object Zone "global-config" {
  global = true
}

object Zone "global-templates" {
  global = true
}
  • Version used (icinga2 --version):
icinga2 - The Icinga 2 network monitoring daemon (version: r2.13.7-1)

Copyright (c) 2012-2023 Icinga GmbH (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later <https://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

System information:
  Platform: Debian GNU/Linux
  Platform version: 11 (bullseye)
  Kernel: Linux
  Kernel version: 5.10.0-23-amd64
  Architecture: x86_64

Build information:
  Compiler: GNU 10.2.1
  Build host: runner-hh8q3bz2-project-575-concurrent-0
  OpenSSL version: OpenSSL 1.1.1n  15 Mar 2022

Application information:

General paths:
  Config directory: /etc/icinga2
  Data directory: /var/lib/icinga2
  Log directory: /var/log/icinga2
  Cache directory: /var/cache/icinga2
  Spool directory: /var/spool/icinga2
  Run directory: /run/icinga2

Old paths (deprecated):
  Installation root: /usr
  Sysconf directory: /etc
  Run directory (base): /run
  Local state directory: /var

Internal paths:
  Package data directory: /usr/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /run/icinga2/icinga2.pid
  • Operating System and version: Debian 11.6
  • Enabled features (icinga2 feature list):
Disabled features: compatlog debuglog elasticsearch gelf icingadb influxdb influxdb2 opentsdb perfdata statusdata syslog
Enabled features: api checker command graphite ido-mysql livestatus mainlog notification

Greetings
dcz01

You could try a cleanup, although I’m not sure it addresses the root cause:

systemctl stop icinga2
rm -rf /var/lib/icinga2/api/{packages,zones,zones-stage}/*
systemctl start icinga2
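
Because these commands delete everything below /var/lib/icinga2/api, it may be worth archiving the directory first. The following is only a sketch: the function name backup_api_dir and the backup path in the example are assumptions, not anything Icinga 2 ships.

```shell
#!/bin/sh
# Hypothetical helper: archive the API state directory before cleaning it.
# Usage: backup_api_dir API_DIR BACKUP_FILE
backup_api_dir() {
    api_dir=$1
    backup_file=$2
    # -C switches into the parent directory so the archive stores a
    # relative "api/..." tree instead of absolute paths.
    tar -czf "$backup_file" -C "$(dirname "$api_dir")" "$(basename "$api_dir")"
}

# Example (as root, after "systemctl stop icinga2"):
# backup_api_dir /var/lib/icinga2/api /root/icinga2-api-backup.tar.gz
```

The archive can be inspected with tar -tzf if something needs to be restored later.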

Thanks for the cleanup idea, but it didn’t work.
The service won’t start now, and the error is still the same.

I tried it a second time, and your rm command did work, but it also deleted all downtimes and hosts that had been created via the API.

In the end, it worked fine with this command:

rm -rf /var/lib/icinga2/api/{zones,zones-stage}/*
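
For anyone who wants to script this, here is a rough sketch of the same cleanup as a function that takes the API directory as a parameter and leaves packages/ alone. The name clean_synced_zones is made up for illustration. Unlike the glob above, find also removes hidden entries directly below the two directories.

```shell
#!/bin/sh
# Hypothetical helper: empty only the synced-zone directories, keep packages/.
# Usage: clean_synced_zones API_DIR
clean_synced_zones() {
    api_dir=$1
    for sub in zones zones-stage; do
        # ${api_dir:?} aborts if the variable is empty, so the path
        # can never collapse to "/zones" by accident.
        if [ -d "${api_dir:?}/$sub" ]; then
            # -mindepth 1 keeps the directory itself and deletes its contents.
            find "${api_dir:?}/$sub" -mindepth 1 -delete
        fi
    done
}

# Example (as root, with icinga2 stopped):
# clean_synced_zones /var/lib/icinga2/api
```

Running icinga2 daemon -C afterwards, as the log message suggests, shows whether the config validates before the service is started again.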

Thanks for the help.
Greetings
dcz01