Dependency in Director

Hello All

I read through the documentation at
https://icinga.com/docs/icinga2/latest/doc/03-monitoring-basics/#dependencies

I have an integration with ServiceNow Event Management, which pulls events from Icinga using the host/service API.


Use Case 1:
When the ping-down alert is raised, alerts for all the corresponding services are raised as well. They are marked as handled, but they still reach ServiceNow Event Management via the API.

I want to stop these Unknown alerts from being raised at all.
Is that possible?
I created a template.


I created the below dependency rule

I am not sure if I am doing it right

Use Case 2:
When the Postgres connection is down, alerts for the other KPIs fire as well.
The DB check PG_DatabaseConn (a single service) turns red as Critical, and the other services, PG_DeadLock and PG_PreparedTransactions, become Unknown. We want to suppress the Unknowns.
This is what I tried:

Honestly, the concepts are not clear to me. I know I have to suppress the Unknown alerts from being raised, but I am not sure if I am doing it right.

Services of a host have an (automatic) implicit dependency on their host.

  • Host A
    – Service A
    – Service B
    – …

If Host A goes down there won’t be any notification for Service A, B, …
If you got notifications for the services I would check if the services went “UNKNOWN” before the host was DOWN (check the timestamps in the history).

For a dependency you have three options: Host-to-Host, Host-to-Service, Service-to-Service.
You will always need to define a Parent Host, that is/can be the root of a problem.
For Service-to-Service you need to define a Parent Service as well.

In the States field you define all the states in which the dependency counts as fulfilled (“true”). For example, with states = [ Up ] the dependency fails as soon as the parent host is in a non-Up state, and notifications (and/or checks) for the dependent objects are then disabled.
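
To make the States field a bit more concrete, here is a minimal Host-to-Host sketch in plain Icinga 2 config (not Director). The parent host name "core-router" and the rule name are made up for illustration; adjust them to your setup.

```
// Sketch only: "core-router" is a hypothetical parent host name.
apply Dependency "reachable-via-router" to Host {
  parent_host_name = "core-router"

  // The dependency is fulfilled only while the parent host is Up.
  states = [ Up ]

  // What to suppress for the child hosts while the dependency fails.
  disable_checks = true
  disable_notifications = true

  // Apply to every host except the parent itself.
  assign where host.name != "core-router"
}
```

For a Service-to-Service dependency you would apply to Service instead, add parent_service_name, and list service states such as OK or Warning.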

Imo the linked page from the docs explains all of that pretty well :slight_smile:

Hello

Thanks for the explanation.

Use Case 1:
I will test again by bringing a VM down and checking what happens, as the current results are a little confusing because of multiple up and down scenarios.

Use Case 2:
Thanks for the great explanation. I want to set up a Service-to-Service dependency on all servers that belong to the postgres host group. When the Postgres DB is down (the connection-time service is Critical), we need to suppress only the other Postgres-related alerts (not the Linux ones). I am not sure how to define that…

Here is the dependency I created, as I understood it. I have set the Parent Service to the name of the single service for connection-time. If it is Critical, I do not want other alerts to be raised. I have also reduced the sampling interval of the connection-time service.

Before I go to the Postgres team and ask them to test, I just want to make sure I am not going to look absolutely stupid :slight_smile:

If I understand correctly, you have one central DB host with the connection-time service.
If this service is down, DB services on other hosts should not raise alerts.

If so, you need to define the Parent Host as well.
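
For the central-host variant, a rough sketch could look like the following. The host name "central-db-host" and the rule name are placeholders I made up, and "db-check-command" is just the example command name used in this thread.

```
// Sketch only: "central-db-host" is a hypothetical parent host name.
apply Dependency "central-db-connection" to Service {
  // Parent is one specific service on one specific host.
  parent_host_name = "central-db-host"
  parent_service_name = "connection-time"

  // Fulfilled only while the parent service is OK.
  states = [ OK ]
  disable_notifications = true

  assign where service.check_command == "db-check-command"
  // Do not make the parent service depend on itself.
  ignore where host.name == "central-db-host" && service.name == "connection-time"
}
```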

Or is the setup more like this: you have multiple DB hosts, each with the connection-time check and other DB checks?
Then you would need something like this:

apply Dependency "disable-db-checks" to Service {
  parent_service_name = "connection-time"

  assign where service.check_command == "db-check-command"
  ignore where service.name == "connection-time"
}

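Instead of assign where service.check_command == "db-check-command" you could also match on the service name, e.g. assign where match("DB-check_name-schema_*", service.name). Note that == compares strings literally, so a wildcard pattern needs match().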

This example comes from the official docs (https://icinga.com/docs/icinga2/latest/doc/09-object-types/#dependency) “Service-to-Service-on-the-same-Host Dependency”
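
If you prefer to select the child services by name pattern, the same rule could be sketched with match(). The pattern "DB-check_name-schema_*" is only a placeholder naming scheme, not something from your setup.

```
// Sketch only: the name pattern is a placeholder.
apply Dependency "disable-db-checks-by-name" to Service {
  parent_service_name = "connection-time"

  // match(pattern, text) supports * wildcards; == would compare literally.
  assign where match("DB-check_name-schema_*", service.name)
  ignore where service.name == "connection-time"
}
```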

Correct. This is my setup. Simple setup not complex :slight_smile:

Below is the template I created:

template Dependency "tmplDependency-PG-Conn" {
    disable_checks = true
    disable_notifications = true
    period = "Always-24-7"
}

This is the apply rule:

apply Dependency "PG_DependencyConnectionSupress" to Service {
    import "tmplDependency-PG-Conn"

    assign where service.check_command == "cPostgres" && service.display_name != "PG_DatabaseConn"
    parent_service_name = "PG_DatabaseConn"
    states = [ OK ]
}

My Postgres check command is cPostgres, and the parent service is PG_DatabaseConn. I want to disable all other Postgres checks while the PG_DatabaseConn service is firing. If it is not firing, the other individual Postgres services on the same server can fire.

I couldn’t find the ignore option in Director to express this:
ignore where service.name == "agent-health"
So is it OK to do this instead?
&& service.display_name != "PG_DatabaseConn"

I am getting there and understanding the concept :slight_smile:

We tested - Not working

I think I have made a mistake somewhere. It should have worked, but I am not sure what the issue is :frowning:

Hello

Thanks to support, I got it working:

template Dependency "XXX-tmplDependency-PG-Conn" {
    disable_checks = true
    disable_notifications = true
    ignore_soft_states = false
    period = "Always-24-7"
}

apply Dependency "XXX-P_PG_DependencyConnectionSupress" to Service {
    import "XXX-tmplDependency-PG-Conn"
    assign where service.check_command == "cPostgres" && !(service.display_name == "XXX-P_PG_DatabaseConn")
    parent_service_name = "XXX-P_PG_DatabaseConn"
    states = [ OK ]
}

Now I want to create another dependency. From the documentation I understand that there is already an implicit dependency from host to service, but it only stops notifications; the checks still get executed. I want to create a dependency like the example in the docs, but I am not sure how to manage it in Director.

I created the following dependency template:

How will the assign rule look in Director? I created this. Is it correct?

It becomes

apply Dependency "XXX-P_All_DependencyPingSupress" to Service {
    import "XXX-tmplDependency-All-NodePing"
    disable_checks = true
    assign where host.name
}

Not sure how to get just

assign where true

My guess: This should work as well.
I would simply test it and see if the checks are disabled if the host is down.
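
For comparison, as far as I remember the docs, the plain-config version of this rule looks roughly like the sketch below (the rule name is arbitrary). When parent_host_name is omitted in an apply rule to Service, the service's own host is used as the parent. Since host.name is always a non-empty string, assign where host.name should evaluate to true for every service, so the Director output ought to behave the same way.

```
// Sketch only: the rule name is arbitrary.
apply Dependency "disable-host-service-checks" to Service {
  // Stop executing service checks while the service's own host is not Up.
  disable_checks = true

  assign where true
}
```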

Hello

It does disable the checks :star_struck:
But before the Ping Down fires (soft state), some service alerts become Informational (soft state), saying “Agent not able to connect to satellite”.

These active alerts go to ServiceNow as OK events :frowning:.