Distributed config with check_by_ssh

Hello,

First of all, forgive me if this is answered elsewhere on the forum or in the documentation. I’ve been reading about it for days, both in the documentation and on the internet, and I can’t find the problem.

I want to run a distributed Icinga setup with a master and two satellites. As far as I can see, the configuration is propagated to the satellites, but the checks are not executed there. For now I would be satisfied with the ping check being run from the corresponding satellite, but in the future I would like to run checks with check_by_ssh, as I do today (I am starting from a working configuration on a single server).

Here is my current configuration; I hope you can help me.

  • Version used (icinga2 --version) MASTER AND SATELLITES
icinga2 - The Icinga 2 network monitoring daemon (version: 2.13.2-1)

Copyright (c) 2012-2023 Icinga GmbH (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later <https://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

System information:
  Platform: Red Hat Enterprise Linux
  Platform version: 8.7 (Ootpa)
  Kernel: Linux
  Kernel version: 4.18.0-425.19.2.el8_7.x86_64
  Architecture: x86_64

Build information:
  Compiler: GNU 8.4.1
  Build host: runner-hh8q3bz2-project-322-concurrent-0
  OpenSSL version: OpenSSL 1.1.1k  FIPS 25 Mar 2021

Application information:

General paths:
  Config directory: /etc/icinga2
  Data directory: /var/lib/icinga2
  Log directory: /var/log/icinga2
  Cache directory: /var/cache/icinga2
  Spool directory: /var/spool/icinga2
  Run directory: /run/icinga2

Old paths (deprecated):
  Installation root: /usr
  Sysconf directory: /etc
  Run directory (base): /run
  Local state directory: /var

Internal paths:
  Package data directory: /usr/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /run/icinga2/icinga2.pid
  • Enabled features (icinga2 feature list) MASTER Features
icinga2 feature list
Disabled features: command compatlog debuglog elasticsearch gelf graphite influxdb2 livestatus opentsdb perfdata statusdata syslog
Enabled features: api checker icingadb ido-mysql influxdb mainlog notification
  • Enabled features (icinga2 feature list) SATELLITE Features
icinga2 feature list
Disabled features: command compatlog debuglog elasticsearch gelf graphite icingadb influxdb influxdb2 livestatus notification opentsdb perfdata statusdata syslog
Enabled features: api checker mainlog
  • Config validation (icinga2 daemon -C) MASTER
icinga2 daemon -C ; systemctl restart icinga2
[2023-06-01 16:36:52 +0200] information/cli: Icinga application loader (version: 2.13.2-1)
[2023-06-01 16:36:52 +0200] information/cli: Loading configuration file(s).
[2023-06-01 16:36:52 +0200] information/ConfigItem: Committing config item(s).
[2023-06-01 16:36:52 +0200] information/ApiListener: My API identity: srvml4icimase01.mutua.es
[2023-06-01 16:36:52 +0200] warning/ApplyRule: Apply rule 'agent-health-check' (in /etc/icinga2/zones.d/yecora/health.conf: 8:1-8:48) for type 'Dependency' does not match anywhere!
[2023-06-01 16:36:52 +0200] warning/ApplyRule: Apply rule 'agent-health' (in /etc/icinga2/zones.d/yecora/health.conf: 1:0-1:27) for type 'Service' does not match anywhere!
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 1 InfluxdbWriter.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 1 NotificationComponent.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 1 IdoMysqlConnection.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 1 CheckerComponent.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 1 User.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 1 UserGroup.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 10 ServiceGroups.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 3 TimePeriods.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 306 Services.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 9 Zones.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 33 ScheduledDowntimes.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 2 NotificationCommands.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 100 HostGroups.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 325 Notifications.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 33 Downtimes.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 16 Dependencies.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 1 IcingaApplication.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 19 Hosts.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 3 Endpoints.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 1 IcingaDB.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 1 FileLogger.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 1 ApiUser.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 260 CheckCommands.
[2023-06-01 16:36:52 +0200] information/ConfigItem: Instantiated 1 ApiListener.
[2023-06-01 16:36:52 +0200] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2023-06-01 16:36:52 +0200] information/cli: Finished validating the configuration file(s).
  • Config validation (icinga2 daemon -C) SATELLITES
icinga2 daemon -C ; systemctl restart icinga2
[2023-06-01 16:37:27 +0200] information/cli: Icinga application loader (version: 2.13.2-1)
[2023-06-01 16:37:27 +0200] information/cli: Loading configuration file(s).
[2023-06-01 16:37:27 +0200] information/ConfigItem: Committing config item(s).
[2023-06-01 16:37:27 +0200] information/ApiListener: My API identity: srvml1icisate01.mutua.es
[2023-06-01 16:37:27 +0200] warning/ApplyRule: Apply rule 'agent-health-check' (in /var/lib/icinga2/api/zones/alcala/_etc/health.conf: 10:1-10:48) for type 'Dependency' does not match anywhere!
[2023-06-01 16:37:27 +0200] warning/ApplyRule: Apply rule 'agent-health' (in /var/lib/icinga2/api/zones/alcala/_etc/health.conf: 1:0-1:27) for type 'Service' does not match anywhere!
[2023-06-01 16:37:27 +0200] information/ConfigItem: Instantiated 1 CheckerComponent.
[2023-06-01 16:37:27 +0200] information/ConfigItem: Instantiated 4 Zones.
[2023-06-01 16:37:27 +0200] information/ConfigItem: Instantiated 1 IcingaApplication.
[2023-06-01 16:37:27 +0200] information/ConfigItem: Instantiated 2 Endpoints.
[2023-06-01 16:37:27 +0200] information/ConfigItem: Instantiated 1 FileLogger.
[2023-06-01 16:37:27 +0200] information/ConfigItem: Instantiated 259 CheckCommands.
[2023-06-01 16:37:27 +0200] information/ConfigItem: Instantiated 1 ApiListener.
[2023-06-01 16:37:27 +0200] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2023-06-01 16:37:27 +0200] information/cli: Finished validating the configuration file(s).
  • MASTER zones.conf
object Endpoint "CPD4icimase01.example.es" { }
object Endpoint "CPD1icisate01.example.es" {host = "CPD1icisate01.example.es" }
object Endpoint "CPD2icisate01.example.es" {host = "CPD2icisate01.example.es" }

object Zone "master" {
        endpoints = [ "CPD4icimase01.example.es" ]
}
object Zone "global-templates" {
        global = true
}
object Zone "director-global" {
        global = true
}
object Zone "alcala" {
        endpoints = [ "CPD1icisate01.example.es" ]
        parent = "master"
}
object Zone "yecora" {
        endpoints = [ "CPD2icisate01.example.es" ]
        parent = "master"
}
  • SATELLITE CPD1icisate01.example.es zones.conf
object Endpoint "CPD4icimase01.example.es" {
        host = "CPD4icimase01.example.es"
        port = "5665"
}

object Zone "master" {
        endpoints = [ "CPD4icimase01.example.es" ]
}

object Endpoint "CPD1icisate01.example.es" {
}

object Zone "alcala" {
        endpoints = [ "CPD1icisate01.example.es" ]
        parent = "master"
}

object Zone "global-templates" {
        global = true
}

object Zone "director-global" {
        global = true
}
  • SATELLITE CPD2icisate01.example.es zones.conf
object Endpoint "CPD4icimase01.example.es" {
        host = "CPD4icimase01.example.es"
        port = "5665"
}

object Zone "master" {
        endpoints = [ "CPD4icimase01.example.es" ]
}

object Endpoint "CPD2icisate01.example.es" {
}

object Zone "yecora" {
        endpoints = [ "CPD2icisate01.example.es" ]
        parent = "master"
}

object Zone "global-templates" {
        global = true
}

object Zone "director-global" {
        global = true
}
  • I include the configuration of a host I am using for testing purposes
object Host "linuxmgr01" {
  import "generic-host"
  import "ssh-agent"
  address = "linuxmgr01"
  groups = [ "linux-servers" ]
  groups += [ "labtest" ]
  vars.os = "Linux"
  vars.os_version = "RHEL7"
  vars.tag = [ "salt" ]
  vars.tag += [ "test" ]

  vars.disks["disk /"] = { disk_partitions = "/" }
  vars.iostats_disks["iostats_disk sda" ] = { iostats_disk = "sda" }

  zone = "alcala"
  vars.agent_endpoint = name
}
  • And one satellite
object Host "CPD1icisate01" {
  import "generic-host"
  import "ssh-agent"
  address = "CPD1icisate01"
  groups = [ "linux-servers" ]
  groups += [ "labtest" ]
  vars.os = "Linux"
  vars.os_version = "RHEL8"
  vars.tag = [ "icinga" ]
  vars.tag += [ "test" ]

  vars.disks["disk /"] = { disk_partitions = "/" }
  vars.iostats_disks["iostats_disk sda" ] = { iostats_disk = "sda" }

  zone = "master"
  vars.agent_endpoint = name
}

If you need more information, please ask; I’m starting to get a little desperate.

Thanks in advance

You have defined agent-health in the directory of a satellite zone, so I’d assume you did this for other checks as well. This may result in missing service objects, in which case a check is not executed or not reported to the parent. You can verify existing or missing objects using icinga2 object list .... In general, it’s best practice to use global zone(s) instead.
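
A minimal sketch of that verification, to be run on each node in turn. It assumes the `--type` and `--name` filters of `icinga2 object list` and that every object in the output starts with a header line beginning with `Object`; the helper name `count_objects` is made up for this example.

```shell
# Print how many objects of a given type match a name pattern on this node.
# Usage: count_objects TYPE NAME_PATTERN
# count_objects is a hypothetical helper, not part of Icinga 2.
count_objects() {
  # Each object in the listing starts with a line such as:
  #   Object 'linuxmgr01!ping4' of type 'Service':
  icinga2 object list --type "$1" --name "$2" | grep -c '^Object'
}

# Example calls (host name taken from the test host in this thread):
#   count_objects Host 'linuxmgr01'
#   count_objects Service 'linuxmgr01!*'
```

A result of 0 on a satellite for a host that belongs to its zone would suggest that the object never arrived there.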

Hi,

Thanks for replying! Maybe I got it wrong, but I don’t have any commands or services in the satellite zones; I have them in the global zone. The only checks I have in the zone directories are the health checks.

Should I move agent-health to global-templates? (I did a quick test and it didn’t work either.)

This is the content of the directory:

/etc/icinga2/zones.d # tree
.
├── alcala
│   └── alcala_health.conf
├── global-templates
│   ├── commands.conf
│   └── templates.conf
├── master
│   └── master_health.conf
├── README
└── yecora
    └── yecora_health.conf

4 directories, 6 files

This is one of the agent-health files, for instance:

/etc/icinga2/zones.d/alcala/alcala_health.conf
apply Service "agent-health" {
  check_command = "cluster-zone"
  display_name = "agent-health-" + host.name

  vars.cluster_zone = host.name

  assign where host.zone == "alcala" && host.vars.agent_endpoint
}

apply Dependency "agent-health-check" to Service {
  parent_service_name = "agent-health"

  states = [ OK ] // Fail if the parent service state switches to NOT-OK
  disable_notifications = true

  assign where host.zone == "alcala" && host.vars.agent_endpoint
  // Automatically assigns all agent endpoint checks as child services on the matched host
  ignore where service.name == "agent-health" // Avoid a self reference from child to parent
}

Regards,
Enrique

… and this global zone is defined on both satellites? Did you check whether the required objects exist on all nodes?

That’s the missing piece. I needed to check whether the objects existed on all nodes. On top of that, not all the zones were defined on all the satellites, so the configuration was “stuck” on them, with a message indicating that the zone did not exist.
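
For anyone who wants to check the zone part quickly, here is a rough sketch that reports which expected zones a node does not know. It assumes the `--type` filter of `icinga2 object list` and the header format `Object '<name>' of type 'Zone':`; the helper name `missing_zones` is made up, and the zone list in the usage comment is simply the one from the master zones.conf above, so which zones a given satellite really needs depends on your setup.

```shell
# Print every expected zone name that is not known on this node.
# Usage: missing_zones ZONE [ZONE ...]
# missing_zones is a hypothetical helper, not part of Icinga 2.
missing_zones() {
  # Collect the names of all Zone objects, one per line.
  known=$(icinga2 object list --type Zone | awk -F"'" '/^Object/ {print $2}')
  for zone in "$@"; do
    # Exact, whole-line match; print the zone only if it is absent.
    printf '%s\n' "$known" | grep -qxF -- "$zone" || printf '%s\n' "$zone"
  done
}

# Example call with the zone names from this thread:
#   missing_zones master alcala yecora global-templates director-global
```

Empty output means every zone passed as an argument is defined on that node.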

Thank you so much!

In case anyone reading the post wants to see the object counts per type without having to restart Icinga:
icinga2 object list | grep '^Object' | awk -F"'" '{print $4}' | sort | uniq -c
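
To compare nodes, the same idea can be wrapped so that each node writes a sorted summary and the files are diffed afterwards. This is only a sketch: `object_type_counts` and the `/tmp/*.counts` paths are made up for illustration, and it again assumes header lines of the form `Object '<name>' of type '<Type>':`.

```shell
# Summarize `icinga2 object list` output (read from stdin) as "TYPE COUNT" lines.
# object_type_counts is a hypothetical helper, not part of Icinga 2.
object_type_counts() {
  # With a single quote as field separator, field 4 of a header line is the type.
  awk -F"'" '/^Object/ {count[$4]++} END {for (t in count) print t, count[t]}' | sort
}

# Hypothetical usage: run the matching line on each node, copy the files
# to one machine, then compare them.
#   icinga2 object list | object_type_counts > /tmp/master.counts
#   icinga2 object list | object_type_counts > /tmp/satellite.counts
#   diff /tmp/master.counts /tmp/satellite.counts
```

A satellite normally holds fewer objects than the master, so differences alone are not an error. What matters is a type that is missing entirely, as in the satellite validation output earlier in this thread, which lists no Hosts or Services at all.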