Dependency doesn't work - still receive notifications

Hi

I’m following the Icinga 2 dependency documentation to set up a service-to-service dependency. Judging by the log, the dependency seems to take effect, but we still get the notification regardless.

My configuration is very standard and is shown below. In the debug log I can see that the dependency state matched and the reachability is updated to "not reachable", but the notification is still sent (the actual values in the config and logs are replaced with tokens so as not to disclose internal info):

object Dependency "{{dependency-obj-name}}" {
  parent_host_name = "{{parent_host_name}}"
  parent_service_name = "{{parent_service_name}}"

  child_host_name = "{{child_host_name}}"
  child_service_name = "{{child_service_name}}"

  states = [ Critical, Unknown ]
  ignore_soft_states = true
  disable_checks = false
  disable_notifications = true
}



[2020-05-26 16:53:00 -0700] debug/DbEvents: Updating reachability for checkable '{{child-host-name}}!{{child-service-name}}': not reachable.
[2020-05-26 16:53:25 -0700] debug/DbEvents: Updating reachability for checkable '{{child-host-name}}!{{child-service-name}}': not reachable.
[2020-05-26 16:53:34 -0700] debug/DbEvents: Updating reachability for checkable '{{child-host-name}}!{{child-service-name}}': not reachable.
[2020-05-26 16:53:58 -0700] notice/Dependency: Dependency '{{child-host-name}}!{{child-service-name}}!{{dependency-object-name}}' passed: Parent service '{{parent-host-name}}!{{parent-service-name}}' matches state filter.
[2020-05-26 16:53:58 -0700] debug/CheckerComponent: Executing check for '{{child-host-name}}!{{child-service-name}}'
[2020-05-26 16:53:58 -0700] notice/Dependency: Dependency '{{child-host-name}}!{{child-service-name}}!{{dependency-object-name}}' passed: Parent service '{{parent-host-name}}!{{parent-service-name}}' matches state filter.
[2020-05-26 16:53:58 -0700] debug/CheckerComponent: Check finished for object '{{child-host-name}}!{{child-service-name}}'
[2020-05-26 16:53:58 -0700] notice/Dependency: Dependency '{{child-host-name}}!{{child-service-name}}!{{dependency-object-name}}' passed: Parent service '{{parent-host-name}}!{{parent-service-name}}' matches state filter.
[2020-05-26 16:53:58 -0700] notice/Dependency: Dependency '{{child-host-name}}!{{child-service-name}}!{{dependency-object-name}}' passed: Parent service '{{parent-host-name}}!{{parent-service-name}}' matches state filter.
[2020-05-26 16:53:58 -0700] debug/DbEvents: add log entry history for '{{child-host-name}}!{{child-service-name}}'
[2020-05-26 16:53:58 -0700] debug/DbEvents: add checkable check history for '{{child-host-name}}!{{child-service-name}}'
[2020-05-26 16:53:58 -0700] notice/Dependency: Dependency '{{child-host-name}}!{{child-service-name}}!{{dependency-object-name}}' passed: Parent service '{{parent-host-name}}!{{parent-service-name}}' matches state filter.
[2020-05-26 16:53:58 -0700] debug/DbEvents: add state change history for '{{child-host-name}}!{{child-service-name}}'
[2020-05-26 16:53:58 -0700] notice/Checkable: State Change: Checkable '{{child-host-name}}!{{child-service-name}}' soft state change from CRITICAL to CRITICAL detected.
[2020-05-26 16:54:33 -0700] debug/DbEvents: Updating reachability for checkable '{{child-host-name}}!{{child-service-name}}': not reachable.
[2020-05-26 16:54:56 -0700] notice/Dependency: Dependency '{{child-host-name}}!{{child-service-name}}!{{dependency-object-name}}' passed: Parent service '{{parent-host-name}}!{{parent-service-name}}' matches state filter.
[2020-05-26 16:54:56 -0700] debug/CheckerComponent: Executing check for '{{child-host-name}}!{{child-service-name}}'
[2020-05-26 16:54:56 -0700] notice/Dependency: Dependency '{{child-host-name}}!{{child-service-name}}!{{dependency-object-name}}' passed: Parent service '{{parent-host-name}}!{{parent-service-name}}' matches state filter.
[2020-05-26 16:54:56 -0700] debug/CheckerComponent: Check finished for object '{{child-host-name}}!{{child-service-name}}'
[2020-05-26 16:54:56 -0700] notice/Dependency: Dependency '{{child-host-name}}!{{child-service-name}}!{{dependency-object-name}}' passed: Parent service '{{parent-host-name}}!{{parent-service-name}}' matches state filter.
[2020-05-26 16:54:56 -0700] notice/Dependency: Dependency '{{child-host-name}}!{{child-service-name}}!{{dependency-object-name}}' passed: Parent service '{{parent-host-name}}!{{parent-service-name}}' matches state filter.
[2020-05-26 16:54:56 -0700] debug/DbEvents: add log entry history for '{{child-host-name}}!{{child-service-name}}'
[2020-05-26 16:54:56 -0700] debug/DbEvents: add checkable check history for '{{child-host-name}}!{{child-service-name}}'
[2020-05-26 16:54:56 -0700] notice/Dependency: Dependency '{{child-host-name}}!{{child-service-name}}!{{dependency-object-name}}' passed: Parent service '{{parent-host-name}}!{{parent-service-name}}' matches state filter.
[2020-05-26 16:54:56 -0700] debug/DbEvents: add state change history for '{{child-host-name}}!{{child-service-name}}'
[2020-05-26 16:54:56 -0700] notice/Checkable: State Change: Checkable '{{child-host-name}}!{{child-service-name}}' hard state change from CRITICAL to CRITICAL detected.
[2020-05-26 16:54:56 -0700] information/Checkable: Checking for configured notifications for object '{{child-host-name}}!{{child-service-name}}'
[2020-05-26 16:54:56 -0700] debug/Checkable: Checkable '{{child-host-name}}!{{child-service-name}}' has 1 notification(s).
[2020-05-26 16:54:56 -0700] notice/Notification: Attempting to send  notifications for notification object '{{child-host-name}}!{{child-service-name}}!pagerduty-service'.
[2020-05-26 16:54:56 -0700] information/Notification: Sending 'Problem' notification '{{child-host-name}}!{{child-service-name}}!pagerduty-service' for user 'slack-ops'
[2020-05-26 16:54:56 -0700] information/Notification: Sending 'Problem' notification '{{child-host-name}}!{{child-service-name}}!pagerduty-service' for user 'ops-alerts'
[2020-05-26 16:54:56 -0700] information/Notification: Sending 'Problem' notification '{{child-host-name}}!{{child-service-name}}!pagerduty-service' for user 'pagerduty_ops'
[2020-05-26 16:54:56 -0700] debug/DbEvents: add notification history for '{{child-host-name}}!{{child-service-name}}'
[2020-05-26 16:54:56 -0700] debug/DbEvents: add contact notification history for service '{{child-host-name}}!{{child-service-name}}' and user 'slack-ops'.
[2020-05-26 16:54:56 -0700] debug/DbEvents: add contact notification history for service '{{child-host-name}}!{{child-service-name}}' and user 'ops-alerts'.
[2020-05-26 16:54:56 -0700] debug/DbEvents: add contact notification history for service '{{child-host-name}}!{{child-service-name}}' and user 'pagerduty_ops'.
[2020-05-26 16:54:56 -0700] debug/DbEvents: add log entry history for '{{child-host-name}}!{{child-service-name}}'
[2020-05-26 16:54:56 -0700] information/Notification: Completed sending 'Problem' notification '{{child-host-name}}!{{child-service-name}}!pagerduty-service' for checkable '{{child-host-name}}!{{child-service-name}}' and user 'slack-ops'.
[2020-05-26 16:54:56 -0700] debug/DbEvents: add log entry history for '{{child-host-name}}!{{child-service-name}}'
[2020-05-26 16:54:56 -0700] information/Notification: Completed sending 'Problem' notification '{{child-host-name}}!{{child-service-name}}!pagerduty-service' for checkable '{{child-host-name}}!{{child-service-name}}' and user 'ops-alerts'.
[2020-05-26 16:54:56 -0700] debug/DbEvents: add log entry history for '{{child-host-name}}!{{child-service-name}}'
[2020-05-26 16:54:56 -0700] information/Notification: Completed sending 'Problem' notification '{{child-host-name}}!{{child-service-name}}!pagerduty-service' for checkable '{{child-host-name}}!{{child-service-name}}' and user 'pagerduty_ops'.
[2020-05-26 16:55:31 -0700] debug/DbEvents: Updating reachability for checkable '{{child-host-name}}!{{child-service-name}}': not reachable.

I also experimented with the following:

  1. Switching to the apply Dependency to Service syntax.
  2. Trying different values of ignore_soft_states. I believe it should be set to true so that we don’t receive notifications when the parent is in a soft state, according to the documentation?

Even though the log says the reachability is updated correctly, the notification is still sent.

Hello Zhiyu,
I hope you are doing well. I believe your dependency is not being fully applied because you are using both host and service attributes in your dependency object. Did you review the dependency chapter of the documentation? Using apply rules with dependencies is a lot easier. Below is an example of a service-to-service dependency.

apply Dependency "agent-health-check" to Service {
  parent_service_name = "agent-health"

  states = [ OK ] // Fail if the parent service state switches to NOT-OK
  disable_notifications = true

  assign where host.vars.agent_endpoint // Automatically assigns all agent endpoint checks as child services on the matched host
  ignore where service.name == "agent-health" // Avoid a self reference from child to parent
}

Or, with your code:

apply Dependency "{{dependency-obj-name}}" to Service {
  parent_service_name = "{{parent_service_name}}"

  states = [ OK ] // Fail if the parent service state switches to NOT-OK
  disable_notifications = true

  assign where host.vars.agent_endpoint // Automatically assigns all agent endpoint checks as child services on the matched host
}

BTW, the debug logs you included are hard to follow because the placeholder values in the config do not match those in the logs.

I hope this helps.
Alex

Hi Alex, thanks so much for your reply! I actually tried the apply syntax, but I again set both parent_host_name and parent_service_name, because I need the child service to depend on a service on a specific parent host (which is different from the child host).

I tried something similar to:

apply Dependency "{{dependency-obj-name}}" to Service {
  parent_host_name = "{{parent_host_name}}"
  parent_service_name = "{{parent_service_name}}"

  states = [ OK ] // Fail if the parent service state switches to NOT-OK
  disable_notifications = true

  assign where service.name == "{{child_service_name}}" && host.vars.agent_endpoint
}

Regarding the debug log, any idea what text I should look for to confirm that the dependency is applied correctly? I found log lines such as Updating reachability for checkable ‘{{child-host-name}}!{{child-service-name}}’: not reachable, but the notification is still sent somehow.

Yeah, an apply rule should not be used for just a single service-to-service dependency. Let's go back to your original post and change the states line as shown below. The logs say the service changed from a “Critical” state to a “Critical” state. The states attribute lists the parent states in which the dependency is considered OK, and the original post has “Critical” among them, so the dependency does not trigger while the parent is Critical.

states = [ OK ] // Service dependency, so service states only; fails when the parent is NOT-OK

Regards
Alex
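
To make the point above concrete, here is a small annotated sketch of the two filters side by side. It assumes the documented meaning of states (the parent states in which the dependency counts as OK, defaulting to [ OK, Warning ] for service dependencies) and is not taken from the poster's actual configuration.

```
// Sketch only: two alternative `states` lines for the same service Dependency.

// As in the original post: the dependency counts as OK while the parent is
// Critical or Unknown. That is why the log reports "passed: Parent service
// ... matches state filter" and the child notification still goes out.
// states = [ Critical, Unknown ]

// Inverted filter: the dependency counts as OK only while the parent is OK
// or Warning. In any other parent state it fails and, with
// disable_notifications = true, child notifications are suppressed.
states = [ OK, Warning ]
```

Read this way, a "passed" line in the debug log means the dependency is satisfied and nothing is being suppressed.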

Ah, that helps, thanks @Alex! I think I was misunderstanding the states attribute. So if I want to disable notifications while the parent service is in a Critical or Unknown state, I should set states to [ OK, Warning ] instead?

@ZhiyuChen that is correct. If the parent service state is NOT in [ OK, Warning ], the dependency will trigger and notifications will be disabled. Please review the dependencies chapter for more details.
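
For reference, a minimal sketch of the object from the first post with only the state filter changed as agreed in this thread. The {{...}} tokens are the poster's own placeholders, and the result has not been tested against the poster's environment.

```
object Dependency "{{dependency-obj-name}}" {
  parent_host_name = "{{parent_host_name}}"
  parent_service_name = "{{parent_service_name}}"

  child_host_name = "{{child_host_name}}"
  child_service_name = "{{child_service_name}}"

  // Parent states in which the dependency is considered OK. Any other
  // parent state (Critical or Unknown) makes the dependency fail.
  states = [ OK, Warning ]

  ignore_soft_states = true     // unchanged from the original post
  disable_checks = false        // keep checking the child service
  disable_notifications = true  // suppress child notifications while the dependency fails
}
```

Since [ OK, Warning ] is the documented default for service dependencies, the states line could presumably be dropped altogether; it is kept here to make the intent explicit.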