Satellite does not run checks (yet another topic)

Hello,
I’m sorry to ask about this topic again. I know that a lot has been written on this (I’ve read most of it), but I still can’t get this setup to work. Please help. :sob: I’m able to sync configs from the master to the satellite and the agent, but the checks are not run (they stay pending).

We have a master/agent setup in the master zone, which works fine.
Now the client has seen how cool their monitoring system is and decided to add one older system to the mix. This system is not visible from the master host (mt1). To reach it, I need to ssh mt1->dm1->vm1, where vm1 is one machine in the “old system”.

I’m trying to do a top-down setup, the same way it’s done for master/agents (where I define hosts in Endpoint objects only on the master machine, mt1).

The master/satellite/agent setup (mt1/dm1/vm1).

Agent (vm1) zones.conf:

object Endpoint "mt1" {
}
object Endpoint "dm1" {
}
object Endpoint "vm1" {
}
object Zone "master" {
  endpoints = ["mt1"]
}
object Zone "dm1" {
  endpoints = ["dm1"]
  parent = "master"
}
object Zone "vm1" {
  parent = "dm1"
  endpoints = ["vm1"]
}
object Zone "global-templates" {
  global = true
}

Satellite (dm1) zones.conf:

object Endpoint "dm1" {
}
object Endpoint "mt1" {
}
object Zone "master" {
  endpoints = [ "mt1", ]
}
object Zone "dm1" {
  endpoints = [ "dm1", ]
  parent = "master"
}
object Zone "global-templates" {
  global = true
}
object Endpoint "vm1" {
  host = "192.168.180.31"
}
object Zone "vm1" {
  endpoints = [ "vm1", ]
  parent = "dm1"
}

Master (mt1) zones.conf:

...
object Endpoint "dm1" {
  host = "10.5.25.77"
}
object Zone "dm1" {
  endpoints = [ "dm1", ]
  parent = "master"
}
...

Master (mt1) zones.d/dm1/vm1.conf:

object Endpoint "vm1" {
//  host = "192.168.180.31"
}
object Zone "vm1" {
  endpoints = ["vm1"]
  parent = "dm1"
}
object Host "vm1" {
  address = "192.168.180.31"
  display_name = "vm1"
  check_command = "hostalive"
  vars.name = "vm1"
//  command_endpoint = "vm1"
  vars.os = "Linux"
  vars.disks["disk"] = {}
  vars.disks["disk /"] = {
    disk_partitions = "/"
  }
  vars.agent = true
 
  //zone = "dm1"
}
object Service "disk-sat" {
  host_name = "vm1"

  check_command = "disk"
}

With this config I’m able to sync global-templates from mt1 to vm1 but checks don’t run on vm1.
Thank you very much for your help!

br
mm

Hi,

your configuration doesn’t use command endpoint checks for running the check on the satellite in the zone dm1. In order to make this happen, modify the configuration like this:

object Host "vm1" {
   ..

  vars.disks["all"] = {}
  vars.disks["/"] = { disk_partitions = "/" }

  vars.agent_endpoint = name
}

Then move the service apply rule into a separate file, e.g. zones.d/vm1/services.conf.

apply Service "disk" for (disk => config in host.vars.disks) { // loop over the `disks` host dictionary
  check_command = "disk"

  command_endpoint = host.vars.agent_endpoint

  vars += config // inherit disk_partitions if set

  assign where host.vars.agent_endpoint && host.vars.os == "Linux"
}
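
As a rough illustration (my reading of how an apply-for rule names and fills its objects, not output taken from this environment), the "/" entry of vars.disks should expand to a service roughly equivalent to the explicit object below. The generated name is the rule name followed by the dictionary key.

```
// Approximate equivalent of what the apply rule generates for the "/" key.
// Assumption: the host sets vars.agent_endpoint = name, i.e. "vm1".
object Service "disk/" {
  host_name = "vm1"
  check_command = "disk"
  command_endpoint = "vm1"      // taken from host.vars.agent_endpoint
  vars.disk_partitions = "/"    // merged in by `vars += config`
}
```

The "all" entry is an empty dictionary, so its service ("diskall") gets no extra vars and the disk command runs with its defaults.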

Cheers,
Michael

Hi!
Thanks for the reply. Unfortunately, it didn’t help. I can see the new services in icingaweb when I change/add them in the mt1-a config, but all of them are pending. I have restarted icinga on all three machines.

br
mm

From the log on vm1:

[2019-12-05 13:20:52 +0000] information/ApiListener: Applying config update from endpoint 'dm1' of zone 'dm1'.
[2019-12-05 13:20:52 +0000] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/global-templates//.timestamp
[2019-12-05 13:20:52 +0000] information/ApiListener: Applying configuration file update for path '/var/lib/icinga2/api/zones/global-templates' (17 Bytes). Received timestamp '2019-12-05 13:20:50 +0000' (1575552050.417037), Current timestamp '2019-12-05 13:13:30 +0000' (1575551610.102356).

Hi,

I almost forgot - please add the output of icinga2 --version of all involved nodes.

Cheers,
Michael

Hi,
here is the output; vm1 runs an older version.

mt1:
icinga2 - The Icinga 2 network monitoring daemon (version: r2.10.4-1)

Copyright (c) 2012-2019 Icinga GmbH (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

System information:
  Platform: CentOS Linux
  Platform version: 7 (Core)
  Kernel: Linux
  Kernel version: 3.10.0-862.11.6.el7.x86_64
  Architecture: x86_64

Build information:
  Compiler: GNU 4.8.5
  Build host: unknown

dm1:
icinga2 - The Icinga 2 network monitoring daemon (version: r2.10.4-1)

Copyright (c) 2012-2019 Icinga GmbH (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

System information:
  Platform: CentOS Linux
  Platform version: 7 (Core)
  Kernel: Linux
  Kernel version: 3.10.0-862.11.6.el7.x86_64
  Architecture: x86_64

Build information:
  Compiler: GNU 4.8.5
  Build host: unknown

vm1:
icinga2 - The Icinga 2 network monitoring daemon (version: r2.7.2-1)

Copyright (c) 2012-2017 Icinga Development Team (https://www.icinga.com/)
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

System information:
  Kernel: Linux
  Kernel version: 2.6.32-696.16.1.el6.i686
  Architecture: i686

Build information:
  Compiler: GNU 4.8.2
  Build host: 2118771359a9

Hi,

The agent vm1 is quite old. Is it actually connected to the satellite dm1? Can you extract the service from the satellite via the REST API?

This involves having an ApiUser object on the master/satellite (you can distribute it via config sync), and then querying the REST API at /v1/objects/services as shown in this example.

Attributes such as last_check, last_check_result, etc. are important to compare on both the master and the satellite.
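
A minimal sketch of such an ApiUser, limited to read-only object queries. The user name "diag" and the password are placeholders I made up, so pick your own.

```
// Hypothetical read-only API user for debugging; name and password are placeholders.
object ApiUser "diag" {
  password = "change-me"
  permissions = [ "objects/query/Host", "objects/query/Service" ]
}
```

With that in place, /v1/objects/services can be queried over HTTPS on the API port (5665 by default) on both mt1 and dm1, and the results compared side by side.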

Also, are other checks working on the satellite, and is its checker feature enabled?
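
For reference, an enabled checker feature normally shows up as features-enabled/checker.conf in the Icinga 2 configuration directory and contains little more than the object below. This is a sketch of the stock file; your packaged copy may differ slightly.

```
// Stock checker feature: without it, the node does not schedule any checks itself.
object CheckerComponent "checker" { }
```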

Cheers,
Michael

@dnsmichi: To my knowledge, it’s not allowed to have Zone and Endpoint objects there (at least in 2.11.0). Am I right? Has this been changed by the bugfix releases?

For an agent that is only indirectly connected from the master’s point of view, that is used as a command_endpoint, and that has nothing synced to its own zone, this is allowed and it works.

The config sample I suggested works in my environments, so I am wondering about its current runtime state on the master/satellite. I suspect that, for some reason, either the satellite does not consider itself responsible for the object, or the agent is not connected to the satellite.

Also, 2.10.x is older and has its own problems. My last resort is to recommend upgrading to 2.11.2. But first I’d like to see a full analysis of the environment.

Cheers,
Michael

Hi,
I can’t work on the system this week. I will come back next week.
Thanks for the help so far!

br
mm