Satellite does not run checks

Hello,

I would like to know what could be behind my current problem.
I successfully installed Icinga 2, including the web GUI, on one VM inside a vDC behind NAT.
Local checks work fine, but checks do not run on the satellite, which I installed on a second VM inside a second vDC, also behind NAT.
Port forwarding for 5665 is in place and working on both sides.

I followed the documentation but could not figure it out.
According to the logs, the connections are OK and the config sync is OK. Checks in the master zone work great, but not on the satellite.

Because of the NAT, the master cannot execute check commands remotely, so I went with the second option: Top Down Config Sync.

Can you please point out what the problem is?

Master
zones.conf

object Endpoint "control_machine" {
}

object Zone "master" {
        endpoints = [ "control_machine" ]
}

object Endpoint "vm_1" {
        host = "SATELLITE-IP"
        port = "5665"
}

object Zone "vDC_1" {
        parent = "master"
        endpoints = [ "vm_1" ]
}

object Zone "global-templates" {
        global = true
}

zones.d/vDC_1
hosts.conf

object Host "vm_1" {
  check_command = "hostalive"
  address = "192.168.29.101"
  zone = "vDC_1" //optional trick: sync the required host object to the client, but enforce the "master" zone to execute the check
}

services.conf

object Service "disk" {
  host_name = "vm_1"
  check_command = "disk"
}

object Service "load" {
  host_name = "vm_1"
  check_command = "load"
}

Satellite
zones.conf

object Endpoint "control_machine" {
        host = "MASTER-IP"
        port = "5665"
}

object Zone "master" {
        endpoints = [ "control_machine" ]
}

object Endpoint "vm_1" {
}

object Zone "vDC_1" {
        endpoints = [ "vm_1" ]
        parent = "master"
}

object Zone "global-templates" {
        global = true
}

Is it OK that the files on the satellite exist only in the cache? And is it OK that the commands “icinga2 object list --type Host” and “icinga2 object list --type Service”
return nothing on the satellite, while they show output on the master?

MASTER-IP and SATELLITE-IP were substituted for the real addresses for security reasons.

The goal is to have one master node with multiple satellites, each with local access to the other VMs inside its vDC.

My topic is similar to this one
But in my opinion there is nothing wrong in my configs.

Thanks in advance

You’re mixing things here. The connection direction is important - either up-down or down-up. Once the connection is established, the upper layers may run commands or sync configuration.

Reading your configuration makes me think that the master should sync the satellite zone vDC_1 to the satellite endpoint with objects located in /etc/icinga2/zones.d/vDC_1.

Right now, the host/service objects inside the configuration should be synced to the satellite. You can verify that inside its logs - it should tell you about the zones it received configuration for, and also trigger a reload. Can you share them?

Cheers,
Michael

Yes, the configuration should live only on the master and be synced to each satellite, which should then run the checks locally and send the results back to the master.

Right after icinga2 daemon -C I see this in the debug log on the satellite:

[2019-07-30 16:07:44 +0200] notice/JsonRpcConnection: Received 'event::Heartbeat' message from 'control_machine'
[2019-07-30 16:07:48 +0200] notice/JsonRpcConnection: Received 'log::SetLogPosition' message from 'control_machine'
[2019-07-30 16:07:49 +0200] notice/CheckerComponent: Pending checkables: 0; Idle checkables: 0; Checks/s: 0
[2019-07-30 16:07:49 +0200] notice/ApiListener: Setting log position for identity 'control_machine': 2019/07/30 14:56:15
[2019-07-30 16:07:51 +0200] notice/ThreadPool: Pool #1: Pending tasks: 0; Average latency: 0ms; Threads: 4; Pool utilization: 0.00219293%
[2019-07-30 16:07:52 +0200] notice/ThreadPool: Pool #2: Pending tasks: 0; Average latency: 0ms; Threads: 4; Pool utilization: 4.94066e-322%
[2019-07-30 16:07:53 +0200] notice/JsonRpcConnection: Received 'log::SetLogPosition' message from 'control_machine'
[2019-07-30 16:07:54 +0200] debug/ApiListener: Not connecting to Endpoint 'vm_1' because that's us.
[2019-07-30 16:07:54 +0200] notice/CheckerComponent: Pending checkables: 0; Idle checkables: 0; Checks/s: 0
[2019-07-30 16:07:54 +0200] debug/ApiListener: Not connecting to Endpoint 'control_machine' because we're already connected to it.
[2019-07-30 16:07:54 +0200] notice/ApiListener: Current zone master: vm_1
[2019-07-30 16:07:54 +0200] notice/ApiListener: Connected endpoints: control_machine (1)
[2019-07-30 16:07:54 +0200] notice/ApiListener: Setting log position for identity 'control_machine': 2019/07/30 14:56:15
[2019-07-30 16:07:54 +0200] notice/JsonRpcConnection: Received 'event::Heartbeat' message from 'control_machine'
[2019-07-30 16:07:58 +0200] notice/JsonRpcConnection: Received 'log::SetLogPosition' message from 'control_machine'
[2019-07-30 16:07:59 +0200] notice/CheckerComponent: Pending checkables: 0; Idle checkables: 0; Checks/s: 0
[2019-07-30 16:07:59 +0200] notice/ApiListener: Setting log position for identity 'control_machine': 2019/07/30 14:56:15

Files are synced in cache on satellite in /var/lib/icinga2/api/zones/vDC_1

What differs is the result of the command icinga2 daemon -C on the satellite:

[2019-07-30 16:15:59 +0200] information/cli: Icinga application loader (version: r2.10.5-1)
[2019-07-30 16:15:59 +0200] information/cli: Loading configuration file(s).
[2019-07-30 16:15:59 +0200] information/ConfigItem: Committing config item(s).
[2019-07-30 16:15:59 +0200] information/ApiListener: My API identity: vDC_1
[2019-07-30 16:15:59 +0200] information/ConfigItem: Instantiated 1 IcingaApplication.
[2019-07-30 16:15:59 +0200] information/ConfigItem: Instantiated 2 FileLoggers.
[2019-07-30 16:15:59 +0200] information/ConfigItem: Instantiated 1 ApiListener.
[2019-07-30 16:15:59 +0200] information/ConfigItem: Instantiated 1 CheckerComponent.
[2019-07-30 16:15:59 +0200] information/ConfigItem: Instantiated 4 Zones.
[2019-07-30 16:15:59 +0200] information/ConfigItem: Instantiated 2 Endpoints.
[2019-07-30 16:15:59 +0200] information/ConfigItem: Instantiated 215 CheckCommands.
[2019-07-30 16:15:59 +0200] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2019-07-30 16:15:59 +0200] information/cli: Finished validating the configuration file(s).

whereas on the master it shows:

[2019-07-30 16:18:26 +0200] information/cli: Icinga application loader (version: r2.10.5-1)
[2019-07-30 16:18:26 +0200] information/cli: Loading configuration file(s).
[2019-07-30 16:18:26 +0200] information/ConfigItem: Committing config item(s).
[2019-07-30 16:18:26 +0200] information/ApiListener: My API identity: control_machine
[2019-07-30 16:18:26 +0200] information/ConfigItem: Instantiated 4 Services.
[2019-07-30 16:18:26 +0200] information/ConfigItem: Instantiated 1 IcingaApplication.
[2019-07-30 16:18:26 +0200] information/ConfigItem: Instantiated 2 Hosts.
[2019-07-30 16:18:26 +0200] information/ConfigItem: Instantiated 2 FileLoggers.
[2019-07-30 16:18:26 +0200] information/ConfigItem: Instantiated 1 NotificationComponent.
[2019-07-30 16:18:26 +0200] information/ConfigItem: Instantiated 1 ApiListener.
[2019-07-30 16:18:26 +0200] information/ConfigItem: Instantiated 1 CheckerComponent.
[2019-07-30 16:18:26 +0200] information/ConfigItem: Instantiated 4 Zones.
[2019-07-30 16:18:26 +0200] information/ConfigItem: Instantiated 2 Endpoints.
[2019-07-30 16:18:26 +0200] information/ConfigItem: Instantiated 2 ApiUsers.
[2019-07-30 16:18:26 +0200] information/ConfigItem: Instantiated 1 IdoMysqlConnection.
[2019-07-30 16:18:26 +0200] information/ConfigItem: Instantiated 215 CheckCommands.
[2019-07-30 16:18:26 +0200] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2019-07-30 16:18:26 +0200] information/cli: Finished validating the configuration file(s).

But that may simply be because the configuration is stored persistently only on the master.

If you need anything more, feel free to ask.

The requested details are already visible in the normal log at information level; search for vDC_1 being the synced zone.

You can also run a manual config validation on the satellite with notice level to see whether the stored configuration is really included.

icinga2 daemon -C -x notice | grep vDC_1

Also, try to manually restart the satellite and check whether the synced configuration worked.

Cheers,
Michael

Result of “icinga2 daemon -C -x notice | grep vDC_1”:

[2019-07-31 09:39:07 +0200] notice/config: Ignoring non local config include for zone 'vDC_1': We already have an authoritative copy included.
[2019-07-31 09:39:07 +0200] information/ApiListener: My API identity: vDC_1

Also icinga2.log after manual restart of the service:

[2019-07-31 09:46:24 +0200] information/Application: Received request to shut down.
[2019-07-31 09:46:25 +0200] information/Application: Shutting down...
[2019-07-31 09:46:25 +0200] information/ApiListener: 'api' stopped.
[2019-07-31 09:46:25 +0200] information/CheckerComponent: 'checker' stopped.
[2019-07-31 09:46:25 +0200] information/FileLogger: 'main-log' started.
[2019-07-31 09:46:25 +0200] information/FileLogger: 'debug-file' started.
[2019-07-31 09:46:25 +0200] information/ApiListener: 'api' started.
[2019-07-31 09:46:25 +0200] information/ApiListener: Started new listener on '[0.0.0.0]:5665'
[2019-07-31 09:46:25 +0200] information/CheckerComponent: 'checker' started.
[2019-07-31 09:46:25 +0200] information/ConfigItem: Activated all objects.
[2019-07-31 09:46:25 +0200] information/cli: Closing console log.
[2019-07-31 09:46:25 +0200] information/ApiListener: Reconnecting to endpoint 'control_machine' via host 'MASTER-IP' and port '5665'
[2019-07-31 09:46:25 +0200] information/ApiListener: New client connection for identity 'control_machine' to [MASTER-IP]:5665
[2019-07-31 09:46:25 +0200] information/ApiListener: Finished reconnecting to endpoint 'control_machine' via host 'MASTER-IP' and port '5665'
[2019-07-31 09:46:25 +0200] information/ApiListener: Requesting new certificate for this Icinga instance from endpoint 'control_machine'.
[2019-07-31 09:46:25 +0200] information/ApiListener: Sending config updates for endpoint 'control_machine' in zone 'master'.
[2019-07-31 09:46:25 +0200] information/ApiListener: Finished sending config file updates for endpoint 'control_machine' in zone 'master'.
[2019-07-31 09:46:25 +0200] information/ApiListener: Syncing runtime objects to endpoint 'control_machine'.
[2019-07-31 09:46:25 +0200] information/ApiListener: Finished syncing runtime objects to endpoint 'control_machine'.
[2019-07-31 09:46:25 +0200] information/ApiListener: Finished sending runtime config updates for endpoint 'control_machine' in zone 'master'.
[2019-07-31 09:46:25 +0200] information/ApiListener: Sending replay log for endpoint 'control_machine' in zone 'master'.
[2019-07-31 09:46:25 +0200] information/ApiListener: Applying config update from endpoint 'control_machine' of zone 'master'.
[2019-07-31 09:46:25 +0200] information/ApiListener: Finished sending replay log for endpoint 'control_machine' in zone 'master'.
[2019-07-31 09:46:25 +0200] information/ApiListener: Finished syncing endpoint 'control_machine' in zone 'master'.
[2019-07-31 09:46:35 +0200] information/WorkQueue: #5 (ApiListener, RelayQueue) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2019-07-31 09:46:35 +0200] information/WorkQueue: #6 (ApiListener, SyncQueue) items: 0, rate: 0.0166667/s (1/min 1/5min 1/15min);
[2019-07-31 09:46:35 +0200] information/WorkQueue: #9 (JsonRpcConnection, #0) items: 0, rate: 0.0666667/s (4/min 4/5min 4/15min);
[2019-07-31 09:46:35 +0200] information/WorkQueue: #10 (JsonRpcConnection, #1) items: 0, rate:  0/s (0/min 0/5min 0/15min);

It looks like it's not syncing the config from the master for zone vDC_1.
But on the satellite the actual hosts and services are present in files under

/var/lib/icinga2/api/zones/vDC_1

Should the satellite, after a successful sync, return the same output as the master for these commands?

icinga2 object list --type Host
icinga2 object list --type Service

Thanks

Does the satellite have anything inside zones.d?

ls -laR /etc/icinga2/zones.d

The log message points to that.

[2019-07-31 09:39:07 +0200] notice/config: Ignoring non local config include for zone 'vDC_1': We already have an authoritative copy included.
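
To spot this notice for every affected zone at once, the validation output can be filtered. This is only a minimal sketch: the function name authoritative_zones is made up for illustration, and the pattern assumes the notice keeps exactly the wording quoted above.

```shell
# Print the names of zones for which Icinga 2 reported an authoritative
# local copy. Reads validation output (or a saved log) from stdin.
# The function name is hypothetical; the pattern mirrors the notice above.
authoritative_zones() {
  sed -n "s/.*Ignoring non local config include for zone '\([^']*\)': We already have an authoritative copy included.*/\1/p" | sort -u
}

# Usage on the satellite:
#   icinga2 daemon -C -x notice | authoritative_zones
```

Any zone printed here has its synced configuration skipped during config loading.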

Cheers,
Michael

It's just a README file:

/etc/icinga2/zones.d:
total 12
drwxr-xr-x 2 root   root   4096 Jul 29 15:30 .
drwxr-x--- 8 nagios nagios 4096 Jul 30 15:03 ..
-rw-r--r-- 1 root   root    133 May 23 15:08 README

I tried deleting it and running the command again, unfortunately with the same message.
Should I somehow flush things manually and try a fresh sync?

And does the message mean that it is ignoring the config files from the master entirely, or just that it thinks it is already synced?

It thinks that it has an authoritative copy included. This works in two ways:

  • Either /etc/icinga2/zones.d contains directories with config files within
  • Or the synced content in /var/lib/icinga2/api/zones has the .authoritative marker file in it, e.g. from a manual rsync from the master.

Please show the content of the synced directory:

ls -laR /var/lib/icinga2/api/zones
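
Both conditions can also be checked in one go. The following is a rough sketch, not an official tool: the function name find_authoritative_sources is invented here, the default paths are the ones used in this thread, and the -mindepth/-maxdepth options assume GNU or BSD find. It lists every directory directly under zones.d; whether an empty directory already counts as authoritative is not settled in this thread.

```shell
# Report what could make a node treat its copy of a zone as authoritative.
# $1 = zones.d directory, $2 = synced zones directory (both optional).
# Function name is hypothetical; defaults are the paths from this thread.
find_authoritative_sources() {
  zones_d=${1:-/etc/icinga2/zones.d}
  synced=${2:-/var/lib/icinga2/api/zones}

  # Case 1: zone directories directly below zones.d
  if [ -d "$zones_d" ]; then
    find "$zones_d" -mindepth 1 -maxdepth 1 -type d | sed 's/^/zones.d directory: /'
  fi

  # Case 2: .authoritative marker files anywhere in the synced content
  if [ -d "$synced" ]; then
    find "$synced" -type f -name '.authoritative' | sed 's/^/marker: /'
  fi
}

# Usage on the satellite (needs read access to both directories):
#   find_authoritative_sources
```

On a satellite that only receives its configuration from the master, the expectation from the explanation above is that this prints nothing.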

Cheers,
Michael

OK, I understand that. It's even weirder now, since the synced files exist:

/var/lib/icinga2/api/zones:
total 12
drwxr-x--- 3 nagios nagios 4096 Jul 29 15:28 .
drwxr-x--- 6 nagios nagios 4096 Jul 29 10:02 ..
drwx------ 3 nagios nagios 4096 Jul 29 15:28 vDC_1

/var/lib/icinga2/api/zones/vDC_1:
total 16
drwx------ 3 nagios nagios 4096 Jul 29 15:28 .
drwxr-x--- 3 nagios nagios 4096 Jul 29 15:28 ..
-rw-r--r-- 1 nagios nagios    0 Jul 29 15:28 .authoritative
drwxr-xr-x 2 nagios nagios 4096 Jul 29 15:28 _etc
-rw-r--r-- 1 nagios nagios   17 Jul 30 14:56 .timestamp

/var/lib/icinga2/api/zones/vDC_1/_etc:
total 16
drwxr-xr-x 2 nagios nagios 4096 Jul 29 15:28 .
drwx------ 3 nagios nagios 4096 Jul 29 15:28 ..
-rw-r--r-- 1 nagios nagios  226 Jul 30 12:19 hosts.conf
-rw-r--r-- 1 nagios nagios  351 Jul 30 09:47 services.conf

The .authoritative file must not exist on the satellite.

For some reason, this was transferred there, either from tests or something else.

The safest way is to manually purge the content, e.g. rm -rf /var/lib/icinga2/api/zones/* and then restart icinga2 on the satellite.
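
As a sketch of that purge step with a small safety check: the function name purge_synced_zones is hypothetical, find -mindepth/-delete assumes GNU or BSD find, and the restart line assumes a systemd-managed service called icinga2, which may differ on your system.

```shell
# Remove all synced zone content so the next sync from the parent starts
# clean. $1 = synced zones directory (defaults to the path in this thread).
# The function name is hypothetical.
purge_synced_zones() {
  synced=${1:-/var/lib/icinga2/api/zones}
  [ -d "$synced" ] || { echo "no such directory: $synced" >&2; return 1; }

  # Delete everything below the directory, dotfiles included,
  # but keep the directory itself.
  find "$synced" -mindepth 1 -delete
}

# Usage on the satellite, as root (service name is an assumption):
#   purge_synced_zones && systemctl restart icinga2
```

After the restart, the satellite should receive the zone content from the master again, this time without the marker file.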

Cheers,
Michael

That was the GOAL!
Now it works: checks are executed and the results are successfully sent back to the master.
Many thanks for your active support!

Hey Guys,

I have the same problem. I get this in the log:

notice/CheckerComponent: Pending checkables: 0; Idle checkables: 15; Checks/s: 0

No active checks are done on the satellite.

I checked the /var/lib/icinga2/api/zones/* files and resynced them with the master; everything looks fine.

My info:

root@satellite:/var/lib/icinga2/api/log# icinga2 daemon -C

    [2019-11-27 18:58:08 +0100] information/cli: Icinga application loader (version: r2.10.3-1)
    [2019-11-27 18:58:08 +0100] information/cli: Loading configuration file(s).
    [2019-11-27 18:58:08 +0100] information/ConfigItem: Committing config item(s).
    [2019-11-27 18:58:08 +0100] information/ApiListener: My API identity: satellite.icinga.stpdom.local
    [2019-11-27 18:58:08 +0100] warning/ApplyRule: Apply rule 'IMAP' (in /var/lib/icinga2/api/zones/director-global/director/service_apply.conf: 1:0-1:19) for type 'Service' does not match anywhere!
    [2019-11-27 18:58:08 +0100] warning/ApplyRule: Apply rule 'SMTP' (in /var/lib/icinga2/api/zones/director-global/director/service_apply.conf: 10:1-10:20) for type 'Service' does not match anywhere!
    [2019-11-27 18:58:08 +0100] warning/ApplyRule: Apply rule 'SSH' (in /var/lib/icinga2/api/zones/director-global/director/service_apply.conf: 19:1-19:19) for type 'Service' does not match anywhere!
    [2019-11-27 18:58:08 +0100] warning/ApplyRule: Apply rule 'FTP' (in /var/lib/icinga2/api/zones/director-global/director/service_apply.conf: 28:1-28:19) for type 'Service' does not match anywhere!
    [2019-11-27 18:58:08 +0100] warning/ApplyRule: Apply rule 'NRPE Check Load' (in /var/lib/icinga2/api/zones/director-global/director/service_apply.conf: 37:1-37:31) for type 'Service' does not match anywhere!
    [2019-11-27 18:58:08 +0100] warning/ApplyRule: Apply rule 'NRPE Check APT' (in /var/lib/icinga2/api/zones/director-global/director/service_apply.conf: 47:1-47:30) for type 'Service' does not match anywhere!
    [2019-11-27 18:58:08 +0100] warning/ApplyRule: Apply rule 'NRPE Check User' (in /var/lib/icinga2/api/zones/director-global/director/service_apply.conf: 57:1-57:31) for type 'Service' does not match anywhere!
    [2019-11-27 18:58:08 +0100] warning/ApplyRule: Apply rule 'NRPE Check Procs' (in /var/lib/icinga2/api/zones/director-global/director/service_apply.conf: 67:1-67:32) for type 'Service' does not match anywhere!
    [2019-11-27 18:58:08 +0100] warning/ApplyRule: Apply rule 'NRPE Check SWAP' (in /var/lib/icinga2/api/zones/director-global/director/service_apply.conf: 77:1-77:31) for type 'Service' does not match anywhere!
    [2019-11-27 18:58:08 +0100] information/ConfigItem: Instantiated 1 ExternalCommandListener.
    [2019-11-27 18:58:08 +0100] information/ConfigItem: Instantiated 1 CheckerComponent.
    [2019-11-27 18:58:08 +0100] information/ConfigItem: Instantiated 12 Zones.
    [2019-11-27 18:58:08 +0100] information/ConfigItem: Instantiated 1 ServiceGroup.
    [2019-11-27 18:58:08 +0100] information/ConfigItem: Instantiated 2 Services.
    [2019-11-27 18:58:08 +0100] information/ConfigItem: Instantiated 3 HostGroups.
    [2019-11-27 18:58:08 +0100] information/ConfigItem: Instantiated 13 Hosts.
    [2019-11-27 18:58:08 +0100] information/ConfigItem: Instantiated 1 IcingaApplication.
    [2019-11-27 18:58:08 +0100] information/ConfigItem: Instantiated 10 Endpoints.
    [2019-11-27 18:58:08 +0100] information/ConfigItem: Instantiated 2 FileLoggers.
    [2019-11-27 18:58:08 +0100] information/ConfigItem: Instantiated 222 CheckCommands.
    [2019-11-27 18:58:08 +0100] information/ConfigItem: Instantiated 2 ApiUsers.
    [2019-11-27 18:58:08 +0100] information/ConfigItem: Instantiated 1 ApiListener.
    [2019-11-27 18:58:08 +0100] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
    [2019-11-27 18:58:08 +0100] information/cli: Finished validating the configuration file(s).

Any ideas? Do you need more info?