Hi,
At the moment we have an Icinga 2 master-satellite-agent setup which has been running fine so far. We use Icinga Web 2 and the Director to configure and deploy commands and checks to all agents. We have 3 zones with 2 masters and 4 satellites: the masters are in the master zone, 2 satellites are in the production zone and 2 satellites are in the test zone. Most of our hosts, including the masters and satellites, run Debian buster. The agent that has the problem runs on an Enterprise Linux system (Oracle Linux 8.3).
Icinga master information:
icinga version:
root@icinga-master-1:~# icinga2 --version
icinga2 - The Icinga 2 network monitoring daemon (version: r2.13.2-1)
Copyright (c) 2012-2022 Icinga GmbH (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later <https://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
System information:
Platform: Debian GNU/Linux
Platform version: 10 (buster)
Kernel: Linux
Kernel version: 4.19.0-18-amd64
Architecture: x86_64
Build information:
Compiler: GNU 8.3.0
Build host: runner-hh8q3bz2-project-298-concurrent-0
OpenSSL version: OpenSSL 1.1.1d 10 Sep 2019
Application information:
General paths:
Config directory: /etc/icinga2
Data directory: /var/lib/icinga2
Log directory: /var/log/icinga2
Cache directory: /var/cache/icinga2
Spool directory: /var/spool/icinga2
Run directory: /run/icinga2
Old paths (deprecated):
Installation root: /usr
Sysconf directory: /etc
Run directory (base): /run
Local state directory: /var
Internal paths:
Package data directory: /usr/share/icinga2
State path: /var/lib/icinga2/icinga2.state
Modified attributes path: /var/lib/icinga2/modified-attributes.conf
Objects path: /var/cache/icinga2/icinga2.debug
Vars path: /var/cache/icinga2/icinga2.vars
PID path: /run/icinga2/icinga2.pid
Icinga Web 2 and Director information:
Icinga Web 2 Version: 2.9.5
Git Commit: 053971c99dc1a4510beb64a888ea695cc14032dc
PHP-Version: 7.3.31-1~deb10u1
Git Commit Date: 2021-11-18
Loaded Libraries
Name | Version |
---|---|
icinga/icinga-php-library | 0.7.0 |
icinga/icinga-php-thirdparty | 0.10.0 |
Loaded Modules
Name | Version |
---|---|
bayerisch | 1.0.0 |
businessprocess | 2.3.1 |
cube | 1.1.0 |
director | 1.8.0 |
doc | 2.9.5 |
fraenkisch | 1.0.0 |
grafana | 1.3.6 |
idoreports | 0.9.1 |
incubator | 0.6.0 |
ipl | v0.5.0 |
jira | 1.1.0 |
monitoring | 2.9.5 |
oesterreichisch | 1.0.0 |
pdfexport | 0.9.1 |
reactbundle | 0.9.0 |
reporting | 0.9.2 |
unicorn | 1.0.2 |
x509 | 1.0.0 |
icinga satellite version:
root@icinga-satellite-test-1:~# icinga2 --version
icinga2 - The Icinga 2 network monitoring daemon (version: r2.13.2-1)
Copyright (c) 2012-2022 Icinga GmbH (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later <https://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
System information:
Platform: Debian GNU/Linux
Platform version: 10 (buster)
Kernel: Linux
Kernel version: 4.19.0-18-amd64
Architecture: x86_64
Build information:
Compiler: GNU 8.3.0
Build host: runner-hh8q3bz2-project-298-concurrent-0
OpenSSL version: OpenSSL 1.1.1d 10 Sep 2019
Application information:
General paths:
Config directory: /etc/icinga2
Data directory: /var/lib/icinga2
Log directory: /var/log/icinga2
Cache directory: /var/cache/icinga2
Spool directory: /var/spool/icinga2
Run directory: /run/icinga2
Old paths (deprecated):
Installation root: /usr
Sysconf directory: /etc
Run directory (base): /run
Local state directory: /var
Internal paths:
Package data directory: /usr/share/icinga2
State path: /var/lib/icinga2/icinga2.state
Modified attributes path: /var/lib/icinga2/modified-attributes.conf
Objects path: /var/cache/icinga2/icinga2.debug
Vars path: /var/cache/icinga2/icinga2.vars
PID path: /run/icinga2/icinga2.pid
One agent in the test zone was reinstalled a few weeks ago. Since the reinstallation the Director zone (director-global) won't sync to it, while other agents work without any problems. I compared the agent config to that of another agent and cannot find any issues. I have to censor the names in this post, so I will call the "broken" agent "agent-problem".
Information for agent-problem:
icinga version:
$ icinga2 --version
icinga2 - The Icinga 2 network monitoring daemon (version: 2.13.2-1)
Copyright (c) 2012-2022 Icinga GmbH (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later <https://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
System information:
Platform: Oracle Linux Server
Platform version: 8.3
Kernel: Linux
Kernel version: 4.18.0-240.22.1.el8_3.x86_64
Architecture: x86_64
Build information:
Compiler: GNU 8.4.1
Build host: runner-hh8q3bz2-project-322-concurrent-0
OpenSSL version: OpenSSL 1.1.1g FIPS 21 Apr 2020
Application information:
General paths:
Config directory: /etc/icinga2
Data directory: /var/lib/icinga2
Log directory: /var/log/icinga2
Cache directory: /var/cache/icinga2
Spool directory: /var/spool/icinga2
Run directory: /run/icinga2
Old paths (deprecated):
Installation root: /usr
Sysconf directory: /etc
Run directory (base): /run
Local state directory: /var
Internal paths:
Package data directory: /usr/share/icinga2
State path: /var/lib/icinga2/icinga2.state
Modified attributes path: /var/lib/icinga2/modified-attributes.conf
Objects path: /var/cache/icinga2/icinga2.debug
Vars path: /var/cache/icinga2/icinga2.vars
PID path: /run/icinga2/icinga2.pid
os version:
$ cat /etc/os-release
NAME="Oracle Linux Server"
VERSION="8.3"
ID="ol"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="8.3"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Oracle Linux Server 8.3"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:oracle:linux:8:3:server"
HOME_URL="https://linux.oracle.com/"
BUG_REPORT_URL="https://bugzilla.oracle.com/"
ORACLE_BUGZILLA_PRODUCT="Oracle Linux 8"
ORACLE_BUGZILLA_PRODUCT_VERSION=8.3
ORACLE_SUPPORT_PRODUCT="Oracle Linux"
ORACLE_SUPPORT_PRODUCT_VERSION=8.3
/etc/icinga2/zones.conf:
object Endpoint NodeName {
}

object Endpoint "icinga-satellite-test-2" {
  host = "icinga-satellite-test-2"
}

object Endpoint "icinga-satellite-test-1" {
  host = "icinga-satellite-test-1"
}

object Zone "agent-problem" {
  endpoints = [ "agent-problem", ]
  parent = "test-satellite"
}

object Zone "director-global" {
  global = true
}

object Zone "global-templates" {
  global = true
}

object Zone "test-satellite" {
  endpoints = [ "icinga-satellite-test-1", "icinga-satellite-test-2", ]
}
The communication and registration between agent-problem and the satellites in the test-satellite zone is working properly.
At first we only noticed that checks were stuck in UNKNOWN with the output:
Check command 'check_load' does not exist.
We forced a resynchronisation of the zone by deleting /var/lib/icinga2/api/zones and /var/lib/icinga2/api/zones-stage and restarting the service. After that we got the following error for zones-stage:
[2022-02-01 09:31:18 +0100] information/cli: Icinga application loader (version: 2.13.1-1)
[2022-02-01 09:31:18 +0100] information/cli: Loading configuration file(s).
[2022-02-01 09:31:18 +0100] information/ConfigItem: Committing config item(s).
[2022-02-01 09:31:18 +0100] information/ApiListener: My API identity: << agent-problem >>
[2022-02-01 09:31:18 +0100] critical/config: Error: Array iterator requires value to be an array.
Location: in /var/lib/icinga2/api/zones-stage//director-global/director/service_apply.conf: 681:1-681:61
/var/lib/icinga2/api/zones-stage//director-global/director/service_apply.conf(679): }
/var/lib/icinga2/api/zones-stage//director-global/director/service_apply.conf(680):
/var/lib/icinga2/api/zones-stage//director-global/director/service_apply.conf(681): apply Service "SMART-Status " for (config in host.vars.disks) {
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/var/lib/icinga2/api/zones-stage//director-global/director/service_apply.conf(682): import "service-agent-template"
/var/lib/icinga2/api/zones-stage//director-global/director/service_apply.conf(683): import "service-5min"
Context:
(0) Evaluating 'apply' rule (in /var/lib/icinga2/api/zones-stage//director-global/director/service_apply.conf: 681:1-681:61)
(1) Evaluating 'apply' rules for host '<< agent-problem >>'
[2022-02-01 09:31:18 +0100] critical/config: 1 error
[2022-02-01 09:31:18 +0100] critical/cli: Config validation failed. Re-run with 'icinga2 daemon -C' after fixing the config.
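For context, as far as I understand the Icinga 2 language reference, this message is about the two iterator forms of apply-for rules: the one-variable form `for (config in ...)` expects an array, while a dictionary needs the two-variable form `for (key => value in ...)`. The fragment below is only a sketch of that difference. The host objects, their names and the disk values are hypothetical and not taken from our config, and the two rules are shown side by side for comparison, not meant to be loaded together.

```icinga2
// Hypothetical host objects for illustration only (not from our config).

// vars.disks as an array: this fits the one-variable iterator
// used in service_apply.conf line 681.
object Host "example-array" {
  check_command = "hostalive"
  vars.disks = [ "/dev/sda", "/dev/sdb" ]
}

// vars.disks as a dictionary: with the one-variable iterator this
// host would trigger "Array iterator requires value to be an array."
object Host "example-dict" {
  check_command = "hostalive"
  vars.disks["disk /"] = {
    disk_partitions = "/"
  }
}

// One-variable (array) iterator, same form as in our service_apply.conf:
apply Service "SMART-Status " for (config in host.vars.disks) {
  import "service-agent-template"
  import "service-5min"
}

// Two-variable (dictionary) iterator, which a dictionary would need instead:
apply Service "disk-" for (name => config in host.vars.disks) {
  import "service-agent-template"
  import "service-5min"
}
```

If this reading is right, the rule is being evaluated on agent-problem against a Host object whose vars.disks is a dictionary. One possible source of such an object would be a locally defined Host, for example from conf.d/hosts.conf; whether that is the case on agent-problem is an assumption that still needs to be checked.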
I have no explanation for why this error occurs here: validation on the master and on all other agents does not fail at all. I wanted to see whether the sync would work if I removed all those checks, and without them the following error appears:
[2022-02-07 12:17:27 +0100] critical/config: Error: Validation failed for object '<< agent-problem >>' of type 'Service'; Attribute 'command_endpoint': Checkable with command endpoint requires a zone. Please check the troubleshooting documentation.
Location: in /var/lib/icinga2/api/zones-stage//director-global/director/service_apply.conf: 13:1-13:19
/var/lib/icinga2/api/zones-stage//director-global/director/service_apply.conf(11): }
/var/lib/icinga2/api/zones-stage//director-global/director/service_apply.conf(12):
/var/lib/icinga2/api/zones-stage//director-global/director/service_apply.conf(13): apply Service "RAM" {
^^^^^^^^^^^^^^^^^^^
/var/lib/icinga2/api/zones-stage//director-global/director/service_apply.conf(14): import "service-agent-template"
/var/lib/icinga2/api/zones-stage//director-global/director/service_apply.conf(15): import "service-5min"
[2022-02-07 12:17:27 +0100] critical/config: 1 error
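My current understanding of this second message, sketched below under assumptions: a service created by an apply rule takes its zone from the host it is applied to, so if the matching Host object has no zone and the service template sets command_endpoint, validation fails in this way. Everything in the fragment is hypothetical, including the host name, the body of "service-agent-template" (I assume it sets command_endpoint the way the Director does for agent checks) and the placeholder check command.

```icinga2
// Hypothetical sketch, not our real config.

// A Host defined in a local config file (outside zones.d and outside
// any synced zone) has no zone attribute.
object Host "example-local-host" {
  check_command = "hostalive"
  address = "127.0.0.1"
}

// Assumed shape of the agent template: it sets command_endpoint.
template Service "service-agent-template" {
  command_endpoint = host_name
}

// The applied service inherits the (empty) zone of the host above,
// so "command_endpoint" is set on an object without a zone.
apply Service "RAM" {
  import "service-agent-template"
  check_command = "dummy" // placeholder check command
  assign where host.name == "example-local-host"
}
```

If that is what happens on agent-problem, both errors would point to a Host object that exists in the agent's local configuration rather than in a synced zone, but this is a guess on my side and not something we have confirmed.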
After those errors we deleted the agent from the Director and tried to add it again, but that did not help. We checked the configuration and compared it to working agents, but we did not see any differences or errors.
Right now we do not have any clues as to what the source of the problem is, or how to reproduce it on another, working agent.
Does anybody have an idea what we could look for or what the error could be?
If you need further information or have other questions, I will try to answer as well as I can.
Thanks in advance for your time and help.