I have a passive service that has a check_interval of 2h.
A cron job sends the disk space used to this passive service every 2 hours: 00:00, 02:00, 04:00, etc.
The service is going purple/stale almost exactly at the same time the new data is published.
I read the manual and it says the execution time is used in the calculation. Since this is a passive check, I assume the execution time is 0, so the interval is exactly what it says: 2 hours.
I have resolved this by changing the check interval to 2h1m to allow for some wiggle room.
Are there any suggestions on dealing with this rather comedic situation?
The execution time is not really relevant here if it is near zero.
But Icinga gets a result and schedules the next check for “in 2h”, so the next result from your cron job will hardly arrive in time. Setting the check interval higher than your cron interval is important. It wouldn’t hurt to set it to 2h5m.
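To make the race concrete, here is a minimal Python sketch of the timing. It is a simplified model, not how Icinga itself computes freshness: it assumes the deadline is simply "arrival of the last result plus check_interval", and the timestamps and the few seconds of delivery delay are made-up values for illustration.

```python
from datetime import datetime, timedelta

def arrives_in_time(last_result, next_result, check_interval):
    """Simplified model: the next result is in time if it arrives
    no later than last_result + check_interval."""
    deadline = last_result + check_interval
    return next_result <= deadline

# Hypothetical arrival times: cron fires at 00:00 and 02:00,
# but delivery takes 3 seconds the first time and 4 seconds the second.
previous = datetime(2024, 1, 1, 0, 0, 3)
following = datetime(2024, 1, 1, 2, 0, 4)

exact = arrives_in_time(previous, following, timedelta(hours=2))
padded = arrives_in_time(previous, following, timedelta(hours=2, minutes=5))

print(exact, padded)  # False True
```

With an interval of exactly 2h, one second of extra delay is enough to miss the deadline. The 5 minutes of padding absorbs that jitter.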
But you can also do something like this in reference to your 2h window:
check_interval (Icinga): 2h
check_attempts (Icinga): 1
interval (cron): 1h
Icinga will check every 2 hours whether there is a problem.
Your check result gets submitted every hour and resets the clock for Icinga to check again (in 2 hours).
I think my approach of matching the check interval exactly to the passive submission interval was unwise, as you have shown.
Since a passive update has very low overhead, sending one every hour with a 2h5m check interval means the state only changes after 2 consecutive failed updates.
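The arithmetic behind that can be sketched in Python. This uses the same simplified model (the deadline is the last good result plus the check interval, ignoring delivery jitter), and the function name is only illustrative.

```python
from datetime import timedelta

def tolerated_misses(submit_every, check_interval):
    """Number of consecutive missed submissions after which the next
    submission still arrives before the deadline (simplified model).

    After k misses, the next result arrives (k + 1) * submit_every after
    the last good one, so we need (k + 1) * submit_every <= check_interval.
    """
    return max(check_interval // submit_every - 1, 0)

hourly = tolerated_misses(timedelta(hours=1), timedelta(hours=2, minutes=5))
two_hourly = tolerated_misses(timedelta(hours=2), timedelta(hours=2, minutes=5))

print(hourly, two_hourly)  # 1 0
```

With hourly submissions, one missed update is absorbed and the deadline only passes after the second consecutive miss. With submissions every 2 hours, a single miss is already enough.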