I have to manually open every check and click “Check now” to get it running periodically again. How could this happen, and is there a command or job that could retrigger all checks?
My environment:
icinga2 - The Icinga 2 network monitoring daemon (version: r2.14.5-1)
Copyright (c) 2012-2025 Icinga GmbH (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later <https://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
System information:
Platform: Ubuntu
Platform version: 22.04.5 LTS (Jammy Jellyfish)
Kernel: Linux
Kernel version: 5.15.0-131-generic
Architecture: x86_64
Build information:
Compiler: GNU 11.4.0
Build host: runner-hh8q3bz2-project-575-concurrent-0
OpenSSL version: OpenSSL 3.0.2 15 Mar 2022
Application information:
General paths:
Config directory: /etc/icinga2
Data directory: /var/lib/icinga2
Log directory: /var/log/icinga2
Cache directory: /var/cache/icinga2
Spool directory: /var/spool/icinga2
Run directory: /run/icinga2
Old paths (deprecated):
Installation root: /usr
Sysconf directory: /etc
Run directory (base): /run
Local state directory: /var
Internal paths:
Package data directory: /usr/share/icinga2
State path: /var/lib/icinga2/icinga2.state
Modified attributes path: /var/lib/icinga2/modified-attributes.conf
Objects path: /var/cache/icinga2/icinga2.debug
Vars path: /var/cache/icinga2/icinga2.vars
PID path: /run/icinga2/icinga2.pid
Enabled features: api checker command graphite icingadb ido-mysql livestatus mainlog notification perfdata
The check period is 24x7 for all services and hosts. I tried a dependency (parent - child) once, but it never worked out. I think the problem is related to a reboot. Services for which no dependency has ever been set were broken as well.
I seem to remember running into similar behaviour once, when it turned out that the “volatile” flag had been inadvertently set in quite a few check templates! (I think that’s where it’s kept in Icinga 2; I currently don’t have access to a running instance to check, sorry.) It took us days to track down that little oversight in our configuration, and hours more to actually switch it off everywhere.
Moral of the story (for us at the time): Do not set non-default values for attributes you don’t understand!
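In case it helps with hunting for that flag: since the `api` feature is enabled in your setup, the objects that ended up with `volatile` set could probably be listed through the Icinga 2 REST API. This is an untested sketch using only the Python standard library. The base URL, the ApiUser credentials and the CA file path are assumptions you would have to adapt, and the helper names are made up for illustration.

```python
import base64
import json
import ssl
import urllib.request


def build_volatile_query(object_type):
    """Build the query body that selects objects with volatile enabled."""
    kind = object_type.lower()
    if kind not in ("host", "service"):
        raise ValueError("object_type must be 'Host' or 'Service'")
    return {
        "filter": f"{kind}.volatile == true",
        "attrs": ["__name", "volatile"],
    }


def names_from_response(response):
    """Extract the sorted object names from an /v1/objects/... response."""
    return sorted(item["name"] for item in response.get("results", []))


def find_volatile(base_url, object_type, user, password, ca_file=None):
    """Ask the API for all hosts or services that have volatile set.

    base_url, user, password and ca_file are assumptions: use the values
    of your own API listener and ApiUser object.
    """
    url = f"{base_url}/v1/objects/{object_type.lower()}s"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        url,
        data=json.dumps(build_volatile_query(object_type)).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
            # A filter in the request body needs POST plus this override.
            "X-HTTP-Method-Override": "GET",
        },
    )
    context = ssl.create_default_context(cafile=ca_file)
    with urllib.request.urlopen(request, context=context) as response:
        return names_from_response(json.load(response))


# Hypothetical usage (host name, user and password are placeholders):
# find_volatile("https://icinga.example.org:5665", "Service",
#               "root", "change-me", "/var/lib/icinga2/certs/ca.crt")
```

The host name in the URL has to match the name in the API certificate, otherwise the TLS check will fail.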
P.S. (late edit): If the checks are just late/pending, make sure it isn’t this problem! If you aren’t using dependencies at all, it shouldn’t apply, though. (Dependencies are a very useful feature, BTW! Just hard to understand…) At the time, during this crisis, I even developed a workaround: a cron job script that regularly forced checks globally. It was of course a crazily simplified workaround that ignored all individual check scheduling…
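I no longer have that script, but something along these lines could do the same job through the REST API's `reschedule-check` action, which also answers the "retrigger all checks" question. Treat it as a rough, untested sketch: the base URL, the ApiUser name and password, and the CA file path are assumptions, and the function names are invented for this example.

```python
import base64
import json
import ssl
import urllib.request


def build_payload(object_type, force=True):
    """Build the request body that reschedules every object of one type."""
    if object_type not in ("Host", "Service"):
        raise ValueError("object_type must be 'Host' or 'Service'")
    kind = object_type.lower()
    return {
        "type": object_type,
        # Match every object name of this type.
        "filter": f'match("*", {kind}.name)',
        # force=True asks for the check regardless of time period settings.
        "force": force,
    }


def reschedule_all(base_url, object_type, user, password, ca_file=None):
    """POST the reschedule-check action and return the decoded JSON reply.

    base_url, user, password and ca_file are assumptions: adapt them to
    your own API listener and ApiUser object.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        f"{base_url}/v1/actions/reschedule-check",
        data=json.dumps(build_payload(object_type)).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )
    context = ssl.create_default_context(cafile=ca_file)
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)


# Hypothetical usage from a cron job (all values are placeholders):
# for object_type in ("Host", "Service"):
#     reschedule_all("https://icinga.example.org:5665", object_type,
#                    "root", "change-me", "/var/lib/icinga2/certs/ca.crt")
```

Since no `next_check` is passed, the checks should be rescheduled for right away. Like my old script, this bypasses the normal per-check scheduling, so it is a stopgap and not a fix for the underlying problem.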
/kb