Initiate/force checks on satellites

I have service configurations for various services based on the active and inactive datacenters (defined in a host var). These NRPE checks have check periods of 5 or 15 minutes. At times, we change the config to swap the active/inactive datacenters. Whenever that is done, some checks go into either “Pending” or “UNKNOWN” state until a forced check is initiated on them (they do clear right away then). Otherwise the checks take more than 15 minutes to return to the OK state on their own. The check period seems to play a role, but I’m not sure about it. Is there any trick to make the satellites run the checks right after taking over?

Please advise,
Thanks

Hi @monigacom,
to resolve that we would need some information about your setup, especially how many Icinga 2 nodes (running icinga2 instances) are there in your setup and how they are connected/configured.

From your description, it sounds like the satellites might not have the checker feature enabled (the part of icinga2 which schedules checks) or something similar.

You should find the most relevant information in the /etc/icinga2/zones.conf file.

This is a 2-tier setup with multiple zones (DC#) and 2 satellites in each zone. All satellites have the “checker” feature enabled. Each zone has about 500 hosts, and all of them run NRPE. The problem shows up in all zones. Whenever a DC is defined as “active”, quite a few services get moved over to this zone from the previously active DC, or rather get instantiated across its satellites. As I said before, the checks do reach the “OK” state eventually, but that can take more than 15 minutes unless I do a “force check”. I want to avoid both the manual “force check” and the wait of more than 15 minutes.

Thanks

How do you move the hosts from one zone to another?
Do you simply move the corresponding .conf files from zones.d/zone-a to zones.d/zone-b on the master?

I guess you are reloading the master icinga2 service after your config move?

You could have a look at the API action for rescheduling checks:
https://icinga.com/docs/icinga-2/latest/doc/12-icinga2-api/#reschedule-check
and trigger an API call after you have done your move and reload.
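As a rough illustration of such a call, here is a minimal Python sketch using only the standard library. It builds a POST to /v1/actions/reschedule-check with "force" set. The master hostname, the API user, the CA path, and the filter on a host custom variable named location are all assumptions for illustration, not details confirmed in this thread, and the API user would need permission for the actions/reschedule-check action.

```python
import base64
import json
import ssl
import urllib.request


def build_reschedule_request(base_url, user, password, service_filter):
    """Build a POST request for the Icinga 2 reschedule-check action."""
    body = {
        "type": "Service",         # act on service objects
        "filter": service_filter,  # Icinga 2 filter expression (assumed by the caller)
        "force": True,             # same effect as a forced check in the web UI
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/v1/actions/reschedule-check",
        data=json.dumps(body).encode(),
        method="POST",
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def send(request, ca_file=None):
    """Send the request and return the decoded JSON response."""
    context = ssl.create_default_context(cafile=ca_file)
    with urllib.request.urlopen(request, context=context, timeout=30) as resp:
        return json.load(resp)
```

A call could then look like send(build_reschedule_request("https://icinga-master.example.org:5665", "apiuser", "secret", 'host.vars.location == "dc1"'), ca_file="/var/lib/icinga2/certs/ca.crt"), with every value replaced by what matches your setup.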

There is a location var defined for each host, and site.conf holds the definition of the active DC. Once a DC is made active, site.conf is updated accordingly and the icinga2 service is restarted.
“reschedule-check” is what I currently use, by selecting the hosts/services in the Icinga Web 2 UI. I still have to try it out as an API call with proper filters. Can that API call be triggered in some automated way after the move/reload?

Also, I’m wondering whether “reschedule-check” can be set up in some config file so that it is triggered automatically on the required hosts/services after a reload. No Director is being used.
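One way to automate this outside of the Icinga configuration could be a small wrapper script that restarts the service, waits until the API answers again, and then fires the reschedule call. The following is only a sketch under assumptions: the helper names are made up for illustration, the restart command assumes a systemd host, and probe and reschedule are callables you would supply yourself (for example a GET on /v1/status and a POST to /v1/actions/reschedule-check).

```python
import subprocess
import time


def wait_until_ready(probe, attempts=30, delay=2.0, sleep=time.sleep):
    """Call probe() until it stops raising OSError; return True on success."""
    for attempt in range(attempts):
        try:
            probe()
            return True
        except OSError:  # covers refused connections and urllib errors
            if attempt < attempts - 1:
                sleep(delay)
    return False


def restart_and_reschedule(probe, reschedule,
                           restart_cmd=("systemctl", "restart", "icinga2"),
                           run=subprocess.run):
    """Restart Icinga 2, wait for the API, then trigger the reschedule call."""
    run(list(restart_cmd), check=True)
    if not wait_until_ready(probe):
        raise RuntimeError("Icinga 2 API did not come back after the restart")
    return reschedule()
```

Note that the master answering API requests does not necessarily mean the satellites have already received and loaded the new config, so an extra delay before the reschedule call may be needed.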

Thanks

Can you share the configs and where they are placed?
Simply from your description I don’t really understand what you are doing.