I hope I’m posting this in the correct section and providing enough info.
I’m being driven mad by my monitoring client, aNAG for Android, which notifies me whenever a monitored service enters a soft failure state, i.e. when check 1 (out of 5 in my case) is above the warning or critical threshold that I’ve set. I do not get notification emails from Icinga2 when this happens; it is only aNAG that notifies me.
I can’t figure out whether it is a bug in aNAG, an error in the configuration of the API user I’ve set up for aNAG, a configuration error I’ve made in aNAG itself, or some other configuration error I might have made in Icinga2.
These soft failures are visible in Icinga2 Web. They are real events that Icinga2 correctly notices and shows if you happen to be looking. But they last only very briefly and I do not want to be warned about them through aNAG. That’s what soft states are for, right? If we reach the number of checks I’ve set (5) THEN I expect a hard fail and to be notified. But not before.
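To make my understanding concrete, here is a minimal Python sketch of how I assume soft and hard states work. It is a simplified model, not Icinga2's actual code: the function name `classify_states` is made up for illustration, and recovery handling is ignored.

```python
def classify_states(results, max_check_attempts=5):
    """Label each check result as OK, SOFT or HARD.

    results: iterable of booleans, True meaning the check passed.
    Simplified model (my assumption): the first failed check counts as
    attempt 1, and the state turns HARD once max_check_attempts
    consecutive failures have been seen.
    """
    labels = []
    attempt = 0  # consecutive non-OK results so far
    for ok in results:
        if ok:
            attempt = 0
            labels.append("OK")
        else:
            attempt = min(attempt + 1, max_check_attempts)
            kind = "HARD" if attempt >= max_check_attempts else "SOFT"
            labels.append(f"{kind} {attempt}/{max_check_attempts}")
    return labels


# A brief blip: two failed checks, then recovery. No HARD state is reached.
print(classify_states([True, False, False, True]))

# A sustained problem: only the fifth consecutive failure becomes HARD.
print(classify_states([False] * 5))
```

In this model, the brief blips I am seeing never get past the SOFT labels, which is why I would not expect any client to alert on them.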
aNAG has a specific setting, “Don’t fetch soft state services”, which is normally ticked. I’ve tried it both ticked and unticked, but it makes no difference.
I’m using 2.6.3 on CentOS 7.
Please can someone suggest some configuration element I might look at?
In api-users.conf, for the user I’ve created specifically for aNAG, after filtering only the hosts and services I want visible based on the value of host.vars.anaghost and service.vars.anagservice, I only have:
permission = "actions/*"
Should there be more? Or less, maybe? Or maybe this has nothing to do with my problem?
In my service template, I have:
max_check_attempts = 5
check_interval = 1m
retry_interval = 30s
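As a rough sanity check on these values, here is a small Python sketch of how long a problem has to persist before I would expect a hard state. It assumes the first failed check counts as attempt 1 and that every later attempt is a recheck exactly one retry_interval after the previous one; the helper name `seconds_until_hard` is hypothetical.

```python
def seconds_until_hard(max_check_attempts, retry_interval_s):
    """Seconds from the first failed check to the HARD state.

    Assumption: the first failure is attempt 1, so only the remaining
    (max_check_attempts - 1) attempts are rechecks, each scheduled one
    retry_interval after the previous check.
    """
    return (max_check_attempts - 1) * retry_interval_s


# max_check_attempts = 5, retry_interval = 30s
print(seconds_until_hard(5, 30))  # 4 rechecks * 30 s = 120
```

So under these assumptions, anything that clears up in less than about two minutes should stay a soft state and never trigger a notification.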
I believe there’s a “volatile” setting that can cause a problem like the one I’m experiencing, but this is definitely not the case in my config. I have never knowingly set it, and grepping the config does not find the word at all.
That’s all I can think of to offer in terms of my configuration. Maybe it gives someone an indication of where I haven’t looked or what I haven’t thought of changing?
I’ve tried searching for aNAG and for “softfail”, “soft fail” and “soft state”, to no avail.