Disable notifications for some services if the host is up

10RUPTiV · October 2, 2021, 8:58pm

Hey guys…

We are monitoring values over SNMP from an appliance, but sometimes SNMP stops responding on the appliance and we get a notification because the service is now “CRITICAL”.

Is there a way to disable notifications for those services if the host is still UP?

steaksauce · October 3, 2021, 5:13pm

In some plugins, timeouts return UNKNOWN, either by default or via a flag. If that’s the case with your current plugin, you can filter out the Unknown state in your notification rule.
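
A minimal sketch of such a filter, assuming the notifications are defined with an apply rule. The rule name, the imported "mail-service-notification" template and the "icingaadmin" user are placeholders borrowed from the Icinga 2 sample configuration, not from this setup:

```
apply Notification "snmp-no-unknown" to Service {
  // Assumed template that provides the notification command.
  import "mail-service-notification"

  // Unknown is left out of the state filter, so UNKNOWN results never notify.
  // OK stays in so that recovery notifications are still sent.
  states = [ OK, Warning, Critical ]
  types = [ Problem, Recovery ]

  // Placeholder user.
  users = [ "icingaadmin" ]

  assign where service.check_command == "snmp"
}
```

This only helps once the plugin really reports the timeout as UNKNOWN. A timeout that comes back as CRITICAL still passes the filter.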

Otherwise, it sounds like you would never want a problem notification (or at least not a critical one), so perhaps you could filter out critical, or just skip problem notifications on the service altogether.
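
As a rough sketch of the second option, notifications can be switched off on a single service while the check keeps running. The service and host names here are hypothetical:

```
object Service "example-snmp-service" {
  host_name = "example-host"
  check_command = "snmp"

  // The check is still executed and shown in the UI,
  // but no notifications are sent for this service.
  enable_notifications = false
}
```

Filtering out only critical would instead be done in the notification's `states` list, at the cost of also hiding real CRITICAL results.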

Edit to add: you might look into what is causing SNMP to time out as well. In our environment it is usually due to packet loss (check_ping) or high resource utilization on the target. If it’s just randomly intermittent, you could also increase the timeout value (start low, increasing it by 5 or 10 seconds).
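
A sketch of raising the timeout, assuming the stock ITL "snmp" check command, which exposes the plugin's `-t` option as `vars.snmp_timeout`. The service name, host name and OID are placeholders:

```
object Service "example-snmp-service" {
  host_name = "example-host"
  check_command = "snmp"

  // Placeholder OID (sysUpTime).
  vars.snmp_oid = ".1.3.6.1.2.1.1.3.0"
  vars.snmp_community = "public"

  // Seconds the plugin waits before giving up; the usual default is 10.
  vars.snmp_timeout = 20
}
```

Keep the value below Icinga's own timeout for the check command, otherwise Icinga kills the plugin before the plugin timeout is reached.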

10RUPTiV · October 4, 2021, 1:02pm

@steaksauce
We already told the appliance company about that, but in the meantime we need to do something.

Currently we are using this:

object Service "Mail Queue [Active]" {
  host_name = "hostname.local"
  check_command = "snmp"

  vars.snmp_oid = ".1.3.6.1.4.1.2021.8.1.101.1"
  vars.snmp_community = "public"
  vars.snmp_warn = "30"
  vars.snmp_crit = "50"
}

and the result is:

Output: CRITICAL - Plugin timed out while executing system call

Is the only option to create a custom SNMP script that handles the timeout and returns UNKNOWN?

10RUPTiV · October 4, 2021, 1:12pm

I think this will solve the problem…

Looks like we need to specify the timeout value to make it return UNKNOWN!
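
A sketch of what that could look like on the service above. It rests on two assumptions that should be checked against `check_snmp --help` first: that the installed check_snmp accepts a timeout of the form `<seconds>:<state>`, and that the ITL "snmp" command passes `vars.snmp_timeout` unchanged to `-t`:

```
object Service "Mail Queue [Active]" {
  host_name = "hostname.local"
  check_command = "snmp"

  vars.snmp_oid = ".1.3.6.1.4.1.2021.8.1.101.1"
  vars.snmp_community = "public"
  vars.snmp_warn = "30"
  vars.snmp_crit = "50"

  // Assumption: the plugin supports the ":<timeout state>" suffix.
  // 10 seconds, then exit with UNKNOWN instead of CRITICAL.
  vars.snmp_timeout = "10:UNKNOWN"
}
```

With the timeout reported as UNKNOWN, a notification rule that leaves Unknown out of its states would stay quiet for these timeouts.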