Child hosts are not shown as unreachable

Hello,

Icinga versions:
master(Ubuntu): r2.11.4-1
satellite(Ubuntu): r2.11.4-1
agent (Windows): v2.11.4-1

My Config:

master

/etc/icinga2/zones.conf

object Endpoint "monitoring.[...]" {
        host = "monitoring.[...]"
}

object Endpoint "monitoring-dev.[...]" {
        host = "monitoring-dev.[...]"
}

object Endpoint "<Name>" {
        log_duration = 0
}

object Zone "master" {
        endpoints = [ "monitoring.[...]" ]
}

object Zone "monitoring-dev.[...]" {
        endpoints = [ "monitoring-dev.[...]" ]
        parent = "master"
}

object Zone "<Name>" {
        endpoints = [ "<Name>" ]
        parent = "monitoring-dev.[...]"
}

object Zone "global-templates" {
        global = true
}

object Zone "director-global" {
        global = true
}

Satellite:
zones.conf:

object Endpoint "monitoring.[...]" {
        host = "monitoring.[...]"
        port = "5665"
}

object Endpoint "<Name>" {
        host = "<Name>"
        log_duration = 0
}

object Endpoint "monitoring-dev.[...]" {
}

object Zone "master" {
        endpoints = [ "monitoring.[...]" ]
}

object Zone "monitoring-dev[...]" {
        endpoints = [ "monitoring-dev.[...]" ]
        parent = "master"
}

object Zone "global-templates" {
        global = true
}

object Zone "director-global" {
        global = true
}

Agent:
zones.conf:

object Endpoint "monitoring-dev.[...]" {
        host = "monitoring-dev.[...]"
        port = "5665"
}

object Endpoint "<Name>" {
}

object Zone "master" {
        endpoints = [ "monitoring-dev.[...]" ]
}

object Zone "<Name>" {
        endpoints = [ "<Name>" ]
        parent = "master"
}

object Zone "global-templates" {
        global = true
}

object Zone "director-global" {
        global = true
}

On master
Host

/etc/icinga2/zones.d/monitoring-dev[…]/hosts_linux.conf

object Host "<Name>" {
  import "generic-host"
  check_command = "hostalive"
  address = "IP-Address"
  vars.os = "Linux"
  vars.agent_endpoint = name
  vars.cluster_zone = "monitoring-dev.[...]"
  vars.notification["mail"] = {
    groups = [ "icingaadmins" ]
  }
}

Service

/etc/icinga2/zones.d/global-templates/satellite-services.conf

apply Service "Check Linux Disk" {
   import "disk-linux"
   check_command = "disk"
   command_endpoint = host.vars.agent_endpoint
   vars.cluster_zone = host.name
   assign where host.vars.cluster_zone == "monitoring-dev.[...]" &&   host.vars.agent_endpoint && host.vars.os == "Linux"
}

So the problem now is that when I stop the Icinga 2 service on the satellite (or shut the satellite down), the service(s) still appear to be working.

The text “DISK OK - free space…” is somewhat grayed out, but the service is still shown as working and reachable. There is no error status like “No connection to monitoring-dev[…] from …” in Icinga Web 2.

Then I tried it with dependencies…

/etc/icinga2/zones.d/monitoring-dev[…]/parents.conf

apply Dependency "TEST" to Host {
  parent_host_name = "monitoring-dev.[...]"
  disable_checks = false
  disable_notifications = true
  ignore_soft_states = false
  states = [ Up, Down ]
  assign where host.name == "<Name>"

}


apply Service "ping4" {
  import "generic-service"
  check_command = "ping4"
  assign where host.address
}


apply Dependency "TEST" to Service {
  parent_host_name = "monitoring-dev.[...]"
  parent_service_name = "ping4"
  disable_checks = false
  states = [ OK ]

  assign where host.name == "<Name>"
}

… but nothing happens, no changes.
The idea is that when the satellite is DOWN, the agent and its services should be shown as DOWN or UNKNOWN or something similar as well. But right now they are shown as OK.
Also, when the satellite is off, the ping4 service of the satellite is CRITICAL, so the agent (and its services) should then be shown as CRITICAL / UNKNOWN…
As it is, you cannot clearly see that the agent is unreachable (while the satellite is off).

I didn’t check through all your configs, but in our (Icinga 2.10) experience, checks that run on the Satellite cannot be monitored by the Master if the Satellite is off. Services checked by a Satellite are only ever updated by that Satellite (or another one in the same zone). The Master can only report what the Satellite has already reported to it.
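
What the Master can check on its own is whether the Satellite zone is still connected. Below is a minimal sketch using the cluster-zone check command from the ITL. It assumes a Host object named "monitoring-dev.[...]" exists in the master zone (the parent_host_name in your dependencies suggests it does). The service name is only a placeholder, and I have not tested this against your setup:

```
// Hypothetical location: a file under zones.d/master/, so that the
// master zone runs this check itself instead of the satellite.
apply Service "satellite-zone-health" {
  check_command = "cluster-zone"

  // Zone whose cluster connection is checked.
  vars.cluster_zone = "monitoring-dev.[...]"

  // Attach the check to the satellite's own host object (assumed to exist).
  assign where host.name == "monitoring-dev.[...]"
}
```

This check should turn CRITICAL once the Master loses the connection to that zone, which at least gives you one visible alert for "Satellite is gone".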

So if you’re setting up dependencies on the Master concerning the Satellite stuff, you can only make the Satellite checks dependent on the whole Satellite Icinga server, as viewed from the Master. And in any case, an activated dependency will just freeze the service in the state it is currently in. I don’t know whether that also affects the “unreachable” status.
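
For reference, here is a minimal sketch of such a host dependency on the Satellite. One thing I did notice in your config: states lists the parent states in which the child still counts as reachable, so with states = [ Up, Down ] the dependency can never fail. The sketch below assumes [ Up ] instead. The dependency name is a placeholder, and whether Icinga Web 2 then shows the children as unreachable in your setup is something you would have to try:

```
apply Dependency "satellite-reachable" to Host {
  parent_host_name = "monitoring-dev.[...]"

  // Parent states in which the child host is still considered reachable.
  // Anything else (here: Down) makes the dependency fail.
  states = [ Up ]

  disable_checks = false
  disable_notifications = true

  // All hosts that are checked through the satellite zone...
  assign where host.vars.cluster_zone == "monitoring-dev.[...]"
  // ...except the satellite itself, in case it carries the same variable.
  ignore where host.name == "monitoring-dev.[...]"
}
```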

(A colleague of mine read the above and has added:) The only option that I can see producing the behaviour you want is: set up a server to serve as NRPE (or similar) host and have all checks run via NRPE on that host… once that NRPE host is unavailable, all the checks will be critical or unknown, depending on your settings.

(Me again:) So, rather use the NRPE solution than a Satellite…