Passive Check Behavior

Hi all, I have a passive check that behaves unexpectedly once the service state is not OK. A cron job runs daily at 02:00 and submits a result via process-check-result. I want the freshness check to run between 09:00 and 09:15 and change the state to UNKNOWN if no update has arrived in the past 24 hours. With the following configuration, cron updates the service to WARNING as expected, but almost 6 hours later the freshness check kicks in and changes the state to UNKNOWN. Any thoughts on what I’m doing wrong?

template Service "generic-service" {
  max_check_attempts = 5
  check_interval = 1m
  retry_interval = 30s
  enable_perfdata = false
}

object TimePeriod "0900to0915" {
  ranges = {
    "monday"    = "09:00-09:15"
    "tuesday"   = "09:00-09:15"
    "wednesday" = "09:00-09:15"
    "thursday"  = "09:00-09:15"
    "friday"    = "09:00-09:15"
    "saturday"  = "09:00-09:15"
    "sunday"    = "09:00-09:15"
  }
}

apply Service "test_service" {
  import "generic-service"
  check_command = "dummy"
  
  enable_active_checks = true
  enable_passive_checks = true
  
  check_interval = 24h
  max_check_attempts = 1
  check_period = "0900to0915"
  
  vars.dummy_state = 3
  vars.dummy_text = {{
    return "No check results received."
  }}
}

vars.dummy_state = 3: that’s the return code for UNKNOWN.

Between 09:00 and 09:15 the active dummy check will run and, in your case, return 3 (UNKNOWN) every minute, so about 14 times in this interval.

In your case I would do something like this:
https://icinga.com/docs/icinga-2/latest/doc/08-advanced-topics/#check-result-freshness
and adapt the return code based on the age of the last check result.

Thanks for the response, but I actually want it to return UNKNOWN if freshness fails. I’m experiencing a few things I don’t understand:

  1. status changing to UNKNOWN well before the check_period
  2. status changing to UNKNOWN when check_interval hasn’t even come close to elapsing

Maybe Icinga can’t do what I’m wanting it to do, but essentially when a result is sent to Icinga, the check_interval clock starts. When check_period is reached, then have freshness check execute dummy only if a result hasn’t been posted within the check_interval.

Did you send a TTL and mess up the freshness by doing so?
Sorry, scrap that but keep it in mind as it can mess up your scheduling!

@moreamazingnick is right, it will not work the way you think.
You need to calculate vars.dummy_state via the DSL. This code from the link above could give you some ideas:

{{
  var service = get_service(macro("$host.name$"), macro("$service.name$"))
  var lastCheck = DateTime(service.last_check).to_string()

  return "No check results received. Last result time: " + lastCheck
}}

This code is for vars.dummy_text, but you can adapt it for vars.dummy_state: check whether service.last_check is older than now - 7h 15min. If it is, return the number 3; if it is newer, return the number 0.
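
A rough sketch of what that could look like, meant to replace the fixed vars.dummy_state = 3 inside the service definition. The 7h 15min threshold comes from the suggestion above (cron result at 02:00, check window ending at 09:15); the variable name maxAge is just made up for illustration, and I have not tested this against your setup:

```
/* Sketch only: goes inside the apply Service "test_service" block. */
vars.dummy_state = {{
  var service = get_service(macro("$host.name$"), macro("$service.name$"))

  /* Assumed threshold: 02:00 cron result plus the window up to 09:15. */
  var maxAge = 7h + 15m

  /* get_time() and service.last_check are both Unix timestamps in seconds. */
  if (get_time() - service.last_check > maxAge) {
    return 3  /* UNKNOWN: no result arrived recently enough */
  }

  return 0  /* OK: a result arrived within maxAge */
}}
```

Keep in mind that returning 0 would replace a WARNING submitted by cron with OK whenever the active check runs. If you want to keep the passive state in that case, returning the current service.state instead of 0 might be an option, but that is an assumption you would need to verify.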

Icinga has a built-in programming language. I highly doubt it can’t do what you want, but maybe not the way you want :wink:

Well, the following day we didn’t experience the same issue, even though the status and results were sent the same way. The ongoing theory is that it was due to retry_interval, so I’ve set it to 24h as well.

If you still have max_check_attempts set to “5”, you need 5 check cycles to reach a hard state.
If you set the retry interval to 24h, this would mean you only recognise an error after about 5 days:

check attempt 1 → Softstate critical
passive checkresult → hardstate OK
24h
check attempt 1 → Softstate critical
24h
check attempt 2 → Softstate critical
24h
check attempt 3 → Softstate critical
24h
check attempt 4 → Softstate critical
24h
check attempt 5 → Hardstate critical → notification

Thanks for the reply. I’m actually overriding the default from the template with max_check_attempts = 1 in the service.
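
For reference, this is roughly how the relevant lines look now. It is only a fragment of the service shown earlier, with the retry_interval change mentioned above added; attributes set after the import take precedence over the values from the template:

```
/* Fragment: inside the apply Service "test_service" block. */
import "generic-service"  /* template sets max_check_attempts = 5, retry_interval = 30s */

max_check_attempts = 1    /* set after the import, so it overrides the template's 5 */
retry_interval = 24h      /* newly added, overrides the template's 30s */
```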