Icinga2: service checks stopped

Hi,

I noticed that many of my service checks, which run every 2 or 5 minutes, have completely stopped working.

I have to manually click “Check now” on every check to get it running periodically again. How could this happen, and is there a command or job that could retrigger all checks?

My environment:

icinga2 - The Icinga 2 network monitoring daemon (version: r2.14.5-1)

Copyright (c) 2012-2025 Icinga GmbH (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later <https://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

System information:
  Platform: Ubuntu
  Platform version: 22.04.5 LTS (Jammy Jellyfish)
  Kernel: Linux
  Kernel version: 5.15.0-131-generic
  Architecture: x86_64

Build information:
  Compiler: GNU 11.4.0
  Build host: runner-hh8q3bz2-project-575-concurrent-0
  OpenSSL version: OpenSSL 3.0.2 15 Mar 2022

Application information:

General paths:
  Config directory: /etc/icinga2
  Data directory: /var/lib/icinga2
  Log directory: /var/log/icinga2
  Cache directory: /var/cache/icinga2
  Spool directory: /var/spool/icinga2
  Run directory: /run/icinga2

Old paths (deprecated):
  Installation root: /usr
  Sysconf directory: /etc
  Run directory (base): /run
  Local state directory: /var

Internal paths:
  Package data directory: /usr/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /run/icinga2/icinga2.pid

Enabled features: api checker command graphite icingadb ido-mysql livestatus mainlog notification perfdata

System health is green.

Most of the checks that stopped working did so at around the same date and time. Does anyone have a hint?
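One way to confirm that the affected checks all stalled at around the same moment is to pull `last_check` and `check_interval` for every service from the Icinga 2 REST API (`/v1/objects/services`) and list the ones that are overdue. The following is only a minimal sketch: it works on the already-decoded `results` list of such a response, and the function name, the `grace_factor` threshold and the sample service names in the usage note are made up for illustration.

```python
import time


def find_stale_services(results, now=None, grace_factor=2.0):
    """Return (name, last_check, age_seconds) for overdue services.

    `results` is the "results" list of an Icinga 2 API response for
    /v1/objects/services that includes the attrs last_check and
    check_interval. A service counts as stale when its last check is
    older than grace_factor * check_interval (an arbitrary threshold).
    """
    now = time.time() if now is None else now
    stale = []
    for obj in results:
        attrs = obj.get("attrs", {})
        last_check = attrs.get("last_check") or 0.0
        interval = attrs.get("check_interval") or 0.0
        if interval <= 0:
            # Without a usable interval there is nothing to compare against.
            continue
        age = now - last_check
        if age > grace_factor * interval:
            stale.append((obj["name"], last_check, age))
    # Oldest last_check first, so services that stalled together sit together.
    return sorted(stale, key=lambda item: item[1])
```

Calling it with the decoded API response, for example `find_stale_services(response["results"])`, gives a list sorted by `last_check`; if many entries share nearly the same timestamp, that points to a single event (such as a restart or reboot) rather than a per-service problem.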

Hi @meisterpropper,
Is this a single node setup (meaning, do you have more than one Icinga2 instance and are they connected or not)?

One Icinga2 instance.

Curious. The checker feature is enabled; a missing checker feature would have been my first candidate.
Are you using Zones in your configuration much?

There are only 3 zones, but I am using only one: the Icinga server itself.

Did you disable “Active Checks” for these Services?

No, all service and host presets have active checks enabled.

Two things come to mind.

Do you use Dependency objects? In earlier Icinga versions I saw such delays in automatic checking when using them.

What is your check_period? Anything other than “24x7”?

The check period is 24x7 for all services and hosts. I tried a dependency (parent/child) once, but it never worked out. I think the problem is related to a reboot. Also, services broke for which no dependency has ever been set.

I seem to remember I experienced a similar behaviour once when it turned out that the “volatile” flag had been inadvertently set for quite a few check templates!! (I think that’s where it’s kept in Icinga2; I currently don’t have access to a running instance to check, sorry.) It took us days to get to that little oversight in our configuration, and hours to then effectively switch it off everywhere.

Moral of the story (for us at the time): Do not set non-default values for attributes you don’t understand!
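The attributes discussed so far (`enable_active_checks`, `check_period`, `volatile`) can be audited in one pass instead of clicking through every object. The sketch below assumes the services were fetched from the Icinga 2 REST API (`/v1/objects/services`) with those three attrs and that the response has already been decoded; the function name and the `expected_period` default of "24x7" (taken from this thread) are illustrative assumptions.

```python
def audit_check_attributes(results, expected_period="24x7"):
    """Map service name -> list of reasons it deviates from the expected setup.

    `results` is the "results" list of an Icinga 2 API response whose
    attrs include enable_active_checks, volatile and check_period.
    Services without any deviation are left out of the returned dict.
    """
    findings = {}
    for obj in results:
        attrs = obj.get("attrs", {})
        reasons = []
        if attrs.get("enable_active_checks") is False:
            reasons.append("active checks disabled")
        if attrs.get("volatile"):
            reasons.append("volatile is set")
        period = attrs.get("check_period")
        # An empty check_period means no restriction, so only flag real names.
        if period and period != expected_period:
            reasons.append(f"check_period is {period!r}")
        if reasons:
            findings[obj["name"]] = reasons
    return findings
```

An empty result does not prove the configuration is fine, but it rules out these three attributes as the cause fairly quickly.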

P.S. (late edit): If the checks are just late/pending, make sure it isn’t this problem! If you aren’t using dependencies at all, it shouldn’t apply, though. (Dependencies are a very useful feature, BTW! Just hard to understand… 😋) BTW, during this crisis I even developed a workaround cron job script that regularly enforced checking globally, which of course was a crazily simplified workaround that ignored all individual check scheduling… 😝
/kb
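For the original question about retriggering all checks at once, and as a rough idea of what such a cron workaround could look like today, the Icinga 2 REST API offers the `reschedule-check` action. The sketch below only builds the HTTP request; it is not the script mentioned above. The function name, the credentials and the filter expression in the usage note are placeholders, and the ApiUser needs permission for `actions/reschedule-check`.

```python
import base64
import json
import urllib.request


def build_reschedule_request(base_url, user, password, filter_expr, force=True):
    """Build (but do not send) a reschedule-check request for Service objects.

    base_url    e.g. "https://localhost:5665" (the API feature's default port)
    filter_expr an Icinga 2 filter expression selecting the services
    force       True asks Icinga to run the check regardless of time period
                and active-check settings
    """
    payload = {"type": "Service", "filter": filter_expr, "force": force}
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/actions/reschedule-check",
        data=json.dumps(payload).encode(),
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

A possible use is `req = build_reschedule_request("https://localhost:5665", "apiuser", "secret", "service.next_check < 1700000000")` followed by `urllib.request.urlopen(req, context=ctx)`, where `ctx` is an `ssl.SSLContext` that trusts the Icinga CA certificate. The timestamp in the filter is a placeholder; in practice it would be computed from the current time so that only overdue services are touched. Like the cron workaround, this treats the symptom and bypasses the normal scheduling, so it is better suited as a one-off recovery than as a permanent fix.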

One more thing to check:
Is the system date and time correct?

A wrong setting there, especially one that changes while Icinga is running, can also lead to inconsistencies in check execution.

Yes, the system time is correct and NTP is working properly without any firewall problems.