Hi all,
I’m trying to set up Icinga2 with Icingaweb2 in my Ansible-ized environment. I’ve followed multiple guides, the most recent being the "Distributed Monitoring" chapter of the Icinga 2 documentation.
I have the following setup:
- 1 main server (temp-manage)
- several servers (daemon being one of them)
All are running Ubuntu 20.04.3. Icinga version r2.13.1-1.
I’ve managed to set up temp-manage as a master server, with icingaweb2 running on it. I saw the server itself in there, plus the stats. Looking good! Then I added a simple remote host by adding an IP in the hosts.conf in conf.d, and wow, I saw the ping response as a node in the web interface. Cool, time for the next step: installing the agents, because I need more detailed information about the machines themselves.
Now I’ve been stuck here for the past 2 days, and I can’t figure out what I’m doing wrong.
Master Node (temp-manage / 172.16.200.1)
/etc/icinga2/icinga2.conf
include "constants.conf"
include "zones.conf"
include <itl>
include <plugins>
include <plugins-contrib>
include <manubulon>
include <windows-plugins>
include <nscp>
include "features-enabled/*.conf"
include "conf.d/api-users.conf"
// Note that I've disabled the conf.d include
/etc/icinga2/zones.conf
object Endpoint "temp-manage" {
}
object Endpoint "daemon" {
  host = "172.16.0.1"
  port = "5665"
  log_duration = 0
}
object Zone "daemon" {
  endpoints = [ "daemon" ]
  parent = "master"
}
object Zone "master" {
  endpoints = [ "temp-manage" ]
}
object Zone "global-templates" {
  global = true
}
object Zone "director-global" {
  global = true
}
/etc/icinga2/zones.d/master/hosts.conf
object Host "daemon" {
  check_command = "hostalive"
  address = "172.16.0.1"
  vars.agent_endpoint = name
}
/etc/icinga2/zones.d/master/services.conf
apply Service "ping4" {
  check_command = "ping4"
  // Check is executed on the master node
  assign where host.address
}
apply Service "disk" {
  check_command = "disk"
  // Specify the remote agent as command execution endpoint, fetch the host custom variable
  command_endpoint = host.vars.agent_endpoint
  // Only assign where a host is marked as agent endpoint
  assign where host.vars.agent_endpoint
}
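For reference, this is what I expect these two apply rules to produce. The Python below is only a rough model of the "assign where" logic as I understand it, not how Icinga evaluates rules internally. The dict layout and the function name expand_apply_rules are made up for the illustration.

```python
from typing import List


def expand_apply_rules(hosts: List[dict]) -> List[dict]:
    """Rough model of the two apply rules in services.conf (illustration only)."""
    services = []
    for host in hosts:
        # apply Service "ping4": assign where host.address
        if host.get("address"):
            services.append(
                {"host": host["name"], "name": "ping4", "command_endpoint": None}
            )
        # apply Service "disk": assign where host.vars.agent_endpoint
        endpoint = host.get("vars", {}).get("agent_endpoint")
        if endpoint:
            services.append(
                {"host": host["name"], "name": "disk", "command_endpoint": endpoint}
            )
    return services


# Mirrors the Host object in hosts.conf; "vars.agent_endpoint = name"
# resolves to the host's own name, i.e. "daemon".
hosts = [
    {"name": "daemon", "address": "172.16.0.1", "vars": {"agent_endpoint": "daemon"}}
]

for service in expand_apply_rules(hosts):
    print(service)
```

So if these files were being loaded, I would expect one Host and two Services (ping4 run from the master, disk run on the agent) to exist.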
Daemon Node (daemon/ 172.16.0.1)
/etc/icinga2/icinga2.conf
include "constants.conf"
include "zones.conf"
include <itl>
include <plugins>
include <plugins-contrib>
include <manubulon>
include <windows-plugins>
include <nscp>
include "features-enabled/*.conf"
include "conf.d/api-users.conf"
/etc/icinga2/zones.conf
object Endpoint "temp-manage" {
  host = "172.16.200.1"
  port = "5665"
}
object Endpoint "daemon" {
  host = "172.16.0.1"
  log_duration = 0
}
object Zone "master" {
  endpoints = [ "temp-manage" ]
}
object Zone "daemon" {
  endpoints = [ "daemon" ]
  parent = "master"
}
object Zone "global-templates" {
  global = true
}
object Zone "director-global" {
  global = true
}
Now here’s the odd part: the nodes seem to be connecting just fine (did the whole signature dance and everything). Here’s some stuff from the logs:
Temp-Manage
[2021-10-09 00:43:06 +0000] information/FileLogger: 'main-log' started.
[2021-10-09 00:43:06 +0000] information/ApiListener: 'api' started.
[2021-10-09 00:43:06 +0000] information/ApiListener: Started new listener on '[::]:5665'
[2021-10-09 00:43:06 +0000] information/DbConnection: 'ido-mysql' started.
[2021-10-09 00:43:06 +0000] information/CheckerComponent: 'checker' started.
[2021-10-09 00:43:06 +0000] information/ConfigItem: Activated all objects.
[2021-10-09 00:43:06 +0000] information/ApiListener: Reconnecting to endpoint 'daemon' via host '172.16.0.1' and port '5665'
[2021-10-09 00:43:06 +0000] information/IdoMysqlConnection: 'ido-mysql' resumed.
[2021-10-09 00:43:06 +0000] information/DbConnection: Resuming IDO connection: ido-mysql
[2021-10-09 00:43:06 +0000] information/IdoMysqlConnection: MySQL IDO instance id: 1 (schema version: '1.15.0')
[2021-10-09 00:43:06 +0000] information/IdoMysqlConnection: Finished reconnecting to 'ido-mysql' database 'icinga' in 0.010608 second(s).
[2021-10-09 00:43:15 +0000] information/ApiListener: New client connection for identity 'daemon' from [::ffff:172.16.0.1]:46068
[2021-10-09 00:43:15 +0000] information/ApiListener: Sending config updates for endpoint 'daemon' in zone 'daemon'.
[2021-10-09 00:43:15 +0000] information/ApiListener: Finished sending config file updates for endpoint 'daemon' in zone 'daemon'.
[2021-10-09 00:43:15 +0000] information/ApiListener: Syncing runtime objects to endpoint 'daemon'.
[2021-10-09 00:43:15 +0000] information/JsonRpcConnection: Received certificate request for CN 'daemon' signed by our CA.
[2021-10-09 00:43:15 +0000] information/JsonRpcConnection: The certificate for CN 'daemon' is valid and uptodate. Skipping automated renewal.
[2021-10-09 00:43:15 +0000] information/ApiListener: Finished syncing runtime objects to endpoint 'daemon'.
[2021-10-09 00:43:15 +0000] information/ApiListener: Finished sending runtime config updates for endpoint 'daemon' in zone 'daemon'.
[2021-10-09 00:43:15 +0000] information/ApiListener: Sending replay log for endpoint 'daemon' in zone 'daemon'.
[2021-10-09 00:43:15 +0000] information/ApiListener: Finished sending replay log for endpoint 'daemon' in zone 'daemon'.
[2021-10-09 00:43:15 +0000] information/ApiListener: Finished syncing endpoint 'daemon' in zone 'daemon'.
[2021-10-09 00:43:16 +0000] information/WorkQueue: #5 (ApiListener, RelayQueue) items: 0, rate: 0/s (0/min 0/5min 0/15min);
[2021-10-09 00:43:16 +0000] information/WorkQueue: #6 (ApiListener, SyncQueue) items: 0, rate: 0/s (0/min 0/5min 0/15min);
[2021-10-09 00:43:16 +0000] information/IdoMysqlConnection: Pending queries: 5 (Input: 3/s; Output: 3/s)
Daemon
[2021-10-09 00:41:35 +0000] information/FileLogger: 'main-log' started.
[2021-10-09 00:41:35 +0000] information/ApiListener: 'api' started.
[2021-10-09 00:41:35 +0000] information/ApiListener: Started new listener on '[::]:5665'
[2021-10-09 00:41:35 +0000] information/ApiListener: Reconnecting to endpoint 'temp-manage' via host '172.16.200.1' and port '5665'
[2021-10-09 00:41:35 +0000] information/CheckerComponent: 'checker' started.
[2021-10-09 00:41:35 +0000] information/ConfigItem: Activated all objects.
[2021-10-09 00:41:35 +0000] information/ApiListener: New client connection for identity 'temp-manage' to [172.16.200.1]:5665
[2021-10-09 00:41:35 +0000] information/ApiListener: Requesting new certificate for this Icinga instance from endpoint 'temp-manage'.
[2021-10-09 00:41:35 +0000] information/ApiListener: Finished reconnecting to endpoint 'temp-manage' via host '172.16.200.1' and port '5665'
[2021-10-09 00:41:35 +0000] information/ApiListener: Sending config updates for endpoint 'temp-manage' in zone 'master'.
[2021-10-09 00:41:35 +0000] information/ApiListener: Finished sending config file updates for endpoint 'temp-manage' in zone 'master'.
[2021-10-09 00:41:35 +0000] information/ApiListener: Syncing runtime objects to endpoint 'temp-manage'.
[2021-10-09 00:41:35 +0000] information/ApiListener: Finished syncing runtime objects to endpoint 'temp-manage'.
[2021-10-09 00:41:35 +0000] information/ApiListener: Finished sending runtime config updates for endpoint 'temp-manage' in zone 'master'.
[2021-10-09 00:41:35 +0000] information/ApiListener: Sending replay log for endpoint 'temp-manage' in zone 'master'.
[2021-10-09 00:41:35 +0000] information/ApiListener: Finished sending replay log for endpoint 'temp-manage' in zone 'master'.
[2021-10-09 00:41:35 +0000] information/ApiListener: Finished syncing endpoint 'temp-manage' in zone 'master'.
[2021-10-09 00:41:35 +0000] information/ApiListener: Applying config update from endpoint 'temp-manage' of zone 'master'.
[2021-10-09 00:41:35 +0000] information/ApiListener: Received configuration updates (0) from endpoint 'temp-manage' are equal to production, skipping validation and reload.
[2021-10-09 00:41:45 +0000] information/WorkQueue: #5 (ApiListener, RelayQueue) items: 0, rate: 0/s (0/min 0/5min 0/15min);
[2021-10-09 00:41:45 +0000] information/WorkQueue: #6 (ApiListener, SyncQueue) items: 0, rate: 0/s (0/min 0/5min 0/15min);
Nothing here looks obviously wrong. Then we have the feature lists on both:
Temp-Manage
Disabled features: command compatlog debuglog elasticsearch gelf graphite icingadb influxdb influxdb2 livestatus notification opentsdb perfdata statusdata syslog
Enabled features: api checker ido-mysql mainlog
Daemon
Disabled features: command compatlog debuglog elasticsearch gelf graphite icingadb influxdb influxdb2 livestatus notification opentsdb perfdata statusdata syslog
Enabled features: api checker mainlog
And the output of icinga2 daemon -C:
Temp-Manage
[2021-10-09 00:47:25 +0000] information/cli: Icinga application loader (version: r2.13.1-1)
[2021-10-09 00:47:25 +0000] information/cli: Loading configuration file(s).
[2021-10-09 00:47:25 +0000] information/ConfigItem: Committing config item(s).
[2021-10-09 00:47:25 +0000] information/ApiListener: My API identity: temp-manage
[2021-10-09 00:47:25 +0000] information/ConfigItem: Instantiated 1 IcingaApplication.
[2021-10-09 00:47:25 +0000] information/ConfigItem: Instantiated 1 FileLogger.
[2021-10-09 00:47:25 +0000] information/ConfigItem: Instantiated 1 CheckerComponent.
[2021-10-09 00:47:25 +0000] information/ConfigItem: Instantiated 1 ApiListener.
[2021-10-09 00:47:25 +0000] information/ConfigItem: Instantiated 1 IdoMysqlConnection.
[2021-10-09 00:47:25 +0000] information/ConfigItem: Instantiated 9 Zones.
[2021-10-09 00:47:25 +0000] information/ConfigItem: Instantiated 7 Endpoints.
[2021-10-09 00:47:25 +0000] information/ConfigItem: Instantiated 3 ApiUsers.
[2021-10-09 00:47:25 +0000] information/ConfigItem: Instantiated 244 CheckCommands.
[2021-10-09 00:47:25 +0000] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2021-10-09 00:47:25 +0000] information/cli: Finished validating the configuration file(s).
Daemon
[2021-10-09 00:47:04 +0000] information/cli: Icinga application loader (version: r2.13.1-1)
[2021-10-09 00:47:04 +0000] information/cli: Loading configuration file(s).
[2021-10-09 00:47:05 +0000] information/ConfigItem: Committing config item(s).
[2021-10-09 00:47:05 +0000] information/ApiListener: My API identity: daemon
[2021-10-09 00:47:05 +0000] information/ConfigItem: Instantiated 1 IcingaApplication.
[2021-10-09 00:47:05 +0000] information/ConfigItem: Instantiated 1 FileLogger.
[2021-10-09 00:47:05 +0000] information/ConfigItem: Instantiated 1 CheckerComponent.
[2021-10-09 00:47:05 +0000] information/ConfigItem: Instantiated 1 ApiListener.
[2021-10-09 00:47:05 +0000] information/ConfigItem: Instantiated 4 Zones.
[2021-10-09 00:47:05 +0000] information/ConfigItem: Instantiated 2 Endpoints.
[2021-10-09 00:47:05 +0000] information/ConfigItem: Instantiated 1 ApiUser.
[2021-10-09 00:47:05 +0000] information/ConfigItem: Instantiated 244 CheckCommands.
[2021-10-09 00:47:05 +0000] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2021-10-09 00:47:05 +0000] information/cli: Finished validating the configuration file(s).
(For the people counting along: I have omitted a couple of zones and endpoints from the config listings here. There are 4 more servers with identical setups and definitions; I haven’t installed the agent on those yet. Fun fact: because Icinga isn’t installed on those, I’m seeing connection errors for those machines in the temp-manage logs, but that’s expected, obviously.)
All the config files and folders are owned by nagios and have the proper permissions.
Here’s the weird thing: shouldn’t there be hosts listed? Or is this output correct for this setup? The validation output never contains a line for hosts, and icinga2 object list does not return any hosts either (on either machine).
It does list all the zones. When I used the conf.d setup I did see hosts, and I also saw them in the web interface.
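To make the missing objects easier to spot, here is a small Python sketch that tallies the "Instantiated N Type." lines from the validation output and reports which expected types are absent. The function names and the choice of Host and Service as the expected types are my own assumptions for illustration; the sample lines are copied from the temp-manage output above.

```python
import re
from typing import Dict, Iterable, List

# Matches e.g. "information/ConfigItem: Instantiated 9 Zones."
_INSTANTIATED = re.compile(r"Instantiated (\d+) (\w+)\.\s*$")


def count_objects(lines: Iterable[str]) -> Dict[str, int]:
    """Map each type name, exactly as printed, to its instantiated count."""
    counts: Dict[str, int] = {}
    for line in lines:
        match = _INSTANTIATED.search(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


def missing_types(counts: Dict[str, int], expected=("Host", "Service")) -> List[str]:
    """Return expected types that appear neither as singular nor with a plain 's' plural."""
    return [t for t in expected if t not in counts and t + "s" not in counts]


sample = """\
[2021-10-09 00:47:25 +0000] information/ConfigItem: Instantiated 9 Zones.
[2021-10-09 00:47:25 +0000] information/ConfigItem: Instantiated 7 Endpoints.
[2021-10-09 00:47:25 +0000] information/ConfigItem: Instantiated 244 CheckCommands.
""".splitlines()

counts = count_objects(sample)
print(counts)
print("missing:", missing_types(counts))
```

For the sample lines this reports both Host and Service as missing, which matches what I see by eye on both machines.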
I really hope someone can point me in the right direction, because I really can’t figure out where to search next.
Thanks in advance.