Service notification is sent after dependency on parent host is back OK

myska.ludek · March 11, 2020, 3:03pm

Hello,
I have many hosts arranged in a tree.
On each host I define vars.parent = “name of its parent”.
I then apply a host-to-host dependency:

apply Dependency "Parent" to Host {
      parent_host_name = host.vars.parent
      assign where host.address && host.vars.parent
}

OK so far.
Then I apply services to the hosts (many of them: ping, ssh, telnet, mdadm, …).
Then I apply a service-to-host dependency:

apply Dependency "Service" to Service {
      assign where host.address && host.vars.parent
}
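The service rule as posted shows no parent_host_name. A minimal sketch of what a complete rule might look like, assuming the intent is for each service to depend on the parent named in host.vars.parent (that assignment is an assumption, not part of the original post). The two disable_* lines only spell out the documented defaults:

```
apply Dependency "Service" to Service {
      // Assumption: the parent is the host named in vars.parent
      parent_host_name = host.vars.parent

      // Documented defaults, written out for clarity
      disable_notifications = true   // suppress notifications while the parent is not UP
      disable_checks = false         // keep checking the service anyway

      assign where host.address && host.vars.parent
}
```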

When a host goes DOWN (or any host on its parent path),
its services do not send notifications because the host is down.
This is the expected behavior.

But after some time (1 h, for example), when the host comes back UP,
the service sends a notification for the CRITICAL state.
Some time later the service is checked and I get a notification
for the OK state.

Is there a way to not send a notification after the parent host comes back up?
Nagios works like a charm with these dependency notifications.

disable_checks = true
is not an option for me, because then I never know the real state of the service or host.

Icinga Web 2 says reachable: no, but the Icinga 2 API says ‘“reachable”: true’.

Output from the Icinga 2 API will be connected to our information system.
I need to know the real state of each service without having to read many useless notifications.
So far we have used Nagios Core. I want to replace Nagios with Icinga 2, but I need to solve this notification problem.

Thanks

twidhalm · March 11, 2020, 4:39pm

Hi @myska.ludek and welcome to our community.

What version of Icinga 2 are you using? I think there was a change about this behaviour in 2.11 but I’m not completely sure.

disable_checks might still be your best bet. Why would you want to check the objects when they are not reachable anyway? Are you afraid of Dependencies triggering because of configuration failures?

Please make sure to check parents at a shorter interval than the dependent hosts. There’s a calculation which can help you avoid getting an alarm from a dependent host before the parent is in a hard state. I know that’s not the problem you just told us about, but it might become one later.
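The interval advice can be sketched in config roughly as follows. Host names, addresses and the timing values are illustrative assumptions; check_interval, retry_interval and max_check_attempts are the standard host attributes that control how quickly a hard state is reached.

```
// Parent: checked often, reaches a hard state quickly
object Host "router-1" {
      check_command = "hostalive"
      address = "192.0.2.1"
      check_interval = 30s
      retry_interval = 10s
      max_check_attempts = 3
}

// Child: checked less often, needs more attempts before a hard state
object Host "switch-1" {
      check_command = "hostalive"
      address = "192.0.2.10"
      vars.parent = "router-1"
      check_interval = 1m
      retry_interval = 30s
      max_check_attempts = 5
}
```

With these example values the parent needs two retries at 10s each (about 20s after the first failed check) to reach a hard DOWN state, while the child needs four retries at 30s each (about 2 minutes), so the parent should normally be in a hard state first.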

myska.ludek · March 12, 2020, 7:07am

Hi,

I’m running Icinga 2 version 2.11.3 on Debian Buster.

We run an information system (IS) that stores information about device configuration and properties.
The Icinga 2 configuration is generated from this IS. The IS shows the status of devices, groups of devices, whole branch offices and the core network, plus many custom views of the network status.
With disable_checks = true, a device shows reachable: true via the API and the IS shows it as OK.
But the device is not OK.

With disable_checks = false everything works fine for me, except for the many notifications when a router on the path goes down and then comes back up.

Thanks

twidhalm · March 12, 2020, 8:11am

Ok, I see. Yes, that’s still a problem in Icinga: the status of dependent objects doesn’t show whether they are still OK or only unreachable. Maybe that will get better with an upcoming version.

Do you get the many notifications only when the router comes back online, or also when it goes offline?

myska.ludek · March 13, 2020, 2:41pm

Notifications are sent when the router goes from DOWN to UP.
A solution would be to add this option to Icinga 2:

when the dependencies in the parent chain are restored, check the service or host first and do not send a notification immediately after the dependency is restored.
(Nagios does it this way.)

I looked into how our information system works (the Icinga 2 config is generated from it and state is reported back to it), and disable_checks is not a solution for me. The real connection path and the path in the config are not always the same. I know this looks strange, but for example, when a bunch of admins are working and somebody reroutes the network path to a branch office over a temporary connection, the router on the configured path goes down. Devices at the branch office then do not send notifications when something goes wrong, but the office overview still shows what is OK and what is not. With disable_checks = true this cannot work: the router will be DOWN and the checks will be suspended. Adding an agent at the branch office is sometimes not possible,
so devices must be checked from a central point. I know this is far from ideal, but it has been tested and has worked for years.

Right now we use Nagios Core for monitoring. I find Icinga 2 to be excellent software with many interesting features (agents, satellites, API, apply statements and many others), and my goal is to replace Nagios Core with Icinga 2. But without a workaround for these notifications I am stuck with Nagios Core.

mprudek · March 13, 2021, 4:18pm

Hi,

I have exactly the same problem with the same Icinga 2 version. When I have two hosts in a dependency chain and both of them go down, only the notification about the first host is sent, as expected. But when both hosts come back up at the same time and the first host is checked first, a notification about the second host being down is sent immediately. It is then followed within seconds by a second notification that it is up again (as soon as the second host is checked).

disable_checks is not an option. (Because there could be “hidden” paths and the second host could in fact be UP; OSPF is involved.)

This behaviour is annoying since our config has not just two hosts but thousands of hosts in a semi-tree structure.

Can somebody confirm this behaviour and give me a hint whether it is intended or will be fixed?

Thank you so much!

P.S. Sending a notification only right after a check would be a nice solution!

mprudek · March 22, 2021, 10:34am

Does really nobody have hosts connected in a dependency chain and face the same problem?

Or is there anybody with hosts connected like this but without any problem?

mprudek · August 18, 2021, 3:35pm

Ping, has still nobody come across this issue?

steaksauce · August 18, 2021, 10:51pm

We don’t use parenting (we have a national network where the “parent” could change because of OSPF/BGP), but if you don’t want to be notified when a host comes UP, then you can modify your notifications to only include the states that you care about (i.e., critical/warning/unknown for services and down for hosts, or just one of those options, or any combination of them).

The Icinga 2 config would look something like this (pulled from our live environment, which is configured via Director):

template Notification "Service-Notify" {
    command = "notify-by-email2"
    interval = 0s
    states = [ Critical, Unknown, Warning ]
    types = [ Acknowledgement, Custom, Problem ]
    vars.body = "*** Icinga *** example alert with some vars perhaps"
    vars.from = "noreply@example.net"
    vars.subject = "Icinga: $service.state$: $service.display_name$ alert for $host.name$!"
    vars.@to = "$user.email$"
}
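
A template on its own sends nothing until a Notification object imports it. A minimal sketch of an apply rule using the template above; the rule name, the user group "icingaadmins" and the assign condition are assumptions for illustration only:

```
apply Notification "service-problems-only" to Service {
    import "Service-Notify"

    // Assumption: a UserGroup with this name exists
    user_groups = [ "icingaadmins" ]

    // Assumption: limit the rule to hosts that have a parent defined
    assign where host.vars.parent
}
```

Because the template lists neither the OK state nor the Recovery type, this filters out the recovery message. A Problem notification for a CRITICAL state still passes the filter, so this does not by itself address the CRITICAL notification sent once the parent is reachable again.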