Problem with setting up distributed monitoring

  • The Icinga 2 network monitoring daemon (version: r2.12.3-1)
  • Debian 10
  • IcingaWeb + director
  • Disabled features: command compatlog debuglog elasticsearch gelf graphite icingadb influxdb livestatus opentsdb perfdata statusdata syslog
    Enabled features: api checker ido-mysql mainlog notification

Hi everyone, I need some advice, please. I’m just getting started with Icinga and would like to implement a MASTER <-> SATELLITE <-> CLIENT setup. I currently have the master + web installed on Debian 10 following the documentation, and it is reachable from the internet. Then there is a satellite in the client’s infrastructure, and I wanted to connect an agent to that satellite; the agent also runs in the client’s local infrastructure. Basically, the master has a public IP, while the satellite and the agent do not and are only reachable locally. As an example of the problem, I can’t run check_memory from the master on the agent behind the satellite. I tried to set it up using the documentation, without success. Thank you for any advice. I attach the zones configuration of each node below.

Master:

/*
 * Generated by Icinga 2 node setup commands
 * on 2021-03-17 21:46:13 +0100
 */

object Endpoint "icinga.[domain].cz" {
}

object Endpoint "icinga-1.infra" {    // Satellite
        host = "icinga-1.infra"
}

object Zone "master" {
	endpoints = [ "icinga.[domain].cz" ]
}

object Zone "global-templates" {
	global = true
}

object Zone "director-global" {
	global = true
}

object Zone "client-ABC" {
	endpoints = [ "icinga-1.infra" ]
        parent = "master"
}

Satellite:

/*
 * Generated by Icinga 2 node setup commands
 * on 2021-03-17 22:13:07 +0100
 */

object Endpoint "icinga.[domain].cz" {
	host = "icinga.[domain].cz"
	port = "5665"
}

object Endpoint "production-1.infra" {
        host = "192.168.1.103"
}


object Zone "master" {
	endpoints = [ "icinga.[domain].cz" ]
}

object Endpoint "icinga-1.infra" {
}

object Zone "client-ABC" {
	endpoints = [ "icinga-1.infra","production-1.infra" ]
	parent = "master"
}

object Zone "global-templates" {
	global = true
}

object Zone "director-global" {
	global = true
}

Agent:

/*
 * Generated by Icinga 2 node setup commands
 * on 2021-03-18 09:56:47 +0100
 */

object Endpoint "icinga-1.infra" {
	host = "192.168.1.106"
	port = "5665"
}

object Zone "master" {
	endpoints = [ "icinga-1.infra" ]
}

object Zone "global-templates" {
	global = true
}

object Zone "director-global" {
	global = true
}

object Endpoint NodeName {
}

object Zone ZoneName {
	endpoints = [ NodeName ]
	parent = "master"
}

Thank you so much for your advice and ideas

Hi & welcome to the icinga community,

This will only work if icinga-1.infra can be resolved from your master and the master is able to connect to the satellite. If not, simply remove the host attribute and let the satellite connect to your master.
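
As a minimal sketch of the second option, assuming the names from the zones.conf files posted above: the master keeps the satellite's Endpoint object but without a host attribute, so it does not try to connect on its own, while the satellite keeps host and port on the master's Endpoint and opens the connection.

```
// Master's zones.conf: no "host", so the master waits for the satellite
object Endpoint "icinga-1.infra" {
}

// Satellite's zones.conf: "host" is set, so the satellite connects to the master
object Endpoint "icinga.[domain].cz" {
	host = "icinga.[domain].cz"
	port = "5665"
}
```

This direction fits a setup where only the master has a public IP.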

Since you are using the Director, you don’t need to define the agent’s zone and endpoint objects manually. Please remove both from the master’s and the satellite’s zones.conf.

The satellite zone object is missing in your master’s zones.conf.

I can’t identify the reason for this object. The satellite zone object is missing in your satellite’s zones.conf.

Did you configure Run on agent?

@rsx
Thank you for the quick response. So if I understand correctly, I should have an agent endpoint on the master, and I should not have a satellite there? And I’m not going to have the agent’s endpoint in the satellite configuration?

icinga.[domain].cz = Master
icinga-1.infra = Satellite
production-1 = Agent

Do you mean to delete that client-ABC zone?

So then the master will look like this?

/*
 * Generated by Icinga 2 node setup commands
 * on 2021-03-17 21:46:13 +0100
 */

object Endpoint "icinga.[domain].cz" {
}

object Zone "master" {
	endpoints = [ "icinga.[domain].cz" ]
}

object Zone "global-templates" {
	global = true
}

object Zone "director-global" {
	global = true
}

And the satellite like this?

/*
 * Generated by Icinga 2 node setup commands
 * on 2021-03-17 22:13:07 +0100
 */

object Endpoint "icinga.[domain].cz" {
	host = "icinga.[domain].cz"
	port = "5665"
}

object Endpoint "production-1.infra" {
        host = "192.168.1.103"
}


object Zone "master" {
	endpoints = [ "icinga.[domain].cz" ]
}

object Endpoint "icinga-1.infra" {
}

object Zone "global-templates" {
	global = true
}

object Zone "director-global" {
	global = true
}

Yes, I chose Run on agent, but unfortunately I got this error, which would be solved by adding an endpoint to the master.

Error: Validation failed for object 'production-1.infra.qop!RAM' of type 'Service'; Attribute 'command_endpoint': Object 'production-1.infra.qop' of type 'Endpoint' does not exist.

Thank you

Your master’s zones.conf needs the zone and endpoint objects of the master and of every satellite.
The satellite’s zones.conf needs the zone and endpoint objects of that satellite plus the master’s zone and endpoint objects.
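
A minimal sketch of that layout for the master, using the host names from this thread. The satellite zone name "icinga-1.infra" is an assumption (any name works as long as it is identical on both nodes), and the satellite's endpoint has no host attribute on the assumption that the satellite opens the connection.

```
// zones.conf on the master (sketch)
object Endpoint "icinga.[domain].cz" {
}

object Endpoint "icinga-1.infra" {
}

object Zone "master" {
	endpoints = [ "icinga.[domain].cz" ]
}

object Zone "icinga-1.infra" {
	endpoints = [ "icinga-1.infra" ]
	parent = "master"	// attaches the satellite zone below the master zone
}

object Zone "global-templates" {
	global = true
}

object Zone "director-global" {
	global = true
}
```

The satellite's zones.conf would carry the same zones and endpoints, with host and port set on the master's Endpoint object so that the satellite knows where to connect.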

When you add a host object in the director for the machine where the agent is installed, the director generates zone and endpoint objects automatically.

This sounds like your host object does not have a Cluster Zone configured in the director.
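
For orientation: "Run on agent" corresponds to the command_endpoint attribute named in the validation error, and that attribute has to point at an existing Endpoint object. The following is only a hand-written approximation of the objects that need to exist for the agent once its host has a Cluster Zone. What the Director actually renders may differ in detail, and both the parent zone name "icinga-1.infra" and the check command name are assumptions.

```
object Endpoint "production-1.infra.qop" {
}

object Zone "production-1.infra.qop" {
	endpoints = [ "production-1.infra.qop" ]
	parent = "icinga-1.infra"	// the Cluster Zone chosen for the host (assumed name)
}

object Service "RAM" {
	host_name = "production-1.infra.qop"
	check_command = "check_memory"	// hypothetical command name
	command_endpoint = "production-1.infra.qop"	// must match the Endpoint above
}
```

If the Endpoint object is missing, config validation fails exactly as in the error quoted above.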

@rsx
So basically all I need is the “director-global” Zone object on both the master and the satellite? But then, for example, where will the endpoint be located on the satellite? Will it be the endpoint for the director-global zone?

You need to define your zones.conf as described and then run the kickstart wizard within the Director. After that you can choose the appropriate Cluster Zone for your clients, e.g. icinga-1.infra if that is the parent of that client.

The zone and endpoint objects for your satellite are missing in your master’s `zones.conf`.

Please do not add zone and endpoint objects in the director at all.

@rsx
Thank you, I made the settings and now the Director wizard has imported the zone and the endpoint. When I ran the setup wizard on the agent behind the satellite again, I saw a certificate request on the master. All services on the client are in the PENDING state and there is no error in the log. On the master, the log shows an invalid ticket for the CN (maybe because I ran the setup of the agent several times and created a new ticket each time).

Master log:
[2021-03-20 13:14:45 +0100] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/director-global//director/service_templates.conf

[2021-03-20 13:14:45 +0100] information/ApiListener: Started new listener on '[0.0.0.0]:5665'

[2021-03-20 13:14:45 +0100] information/DbConnection: 'ido-mysql' started.

[2021-03-20 13:14:45 +0100] information/NotificationComponent: 'notification' started.

[2021-03-20 13:14:45 +0100] information/CheckerComponent: 'checker' started.

[2021-03-20 13:14:45 +0100] information/ConfigItem: Activated all objects.

[2021-03-20 13:14:45 +0100] information/IdoMysqlConnection: 'ido-mysql' resumed.

[2021-03-20 13:14:45 +0100] information/DbConnection: Resuming IDO connection: ido-mysql

[2021-03-20 13:14:45 +0100] information/IdoMysqlConnection: MySQL IDO instance id: 1 (schema version: '1.14.3')

[2021-03-20 13:14:45 +0100] information/IdoMysqlConnection: Finished reconnecting to 'ido-mysql' database 'icinga2' in 0.0548069 second(s).

[2021-03-20 13:14:54 +0100] information/ApiListener: New client connection for identity 'icinga-1.infra.XXX' from [116.202.51.52]:44252

[2021-03-20 13:14:54 +0100] information/ApiListener: Sending config updates for endpoint 'icinga-1.infra.XXX' in zone 'icinga-1.infra.XXX'.

[2021-03-20 13:14:54 +0100] information/ApiListener: Finished sending config file updates for endpoint 'icinga-1.infra.XXX' in zone 'icinga-1.infra.XXX'.

[2021-03-20 13:14:54 +0100] information/ApiListener: Syncing runtime objects to endpoint 'icinga-1.infra.XXX'.

[2021-03-20 13:14:54 +0100] information/ApiListener: Finished syncing runtime objects to endpoint 'icinga-1.infra.XXX'.

[2021-03-20 13:14:54 +0100] information/ApiListener: Finished sending runtime config updates for endpoint 'icinga-1.infra.XXX' in zone 'icinga-1.infra.XXX'.

[2021-03-20 13:14:54 +0100] information/ApiListener: Sending replay log for endpoint 'icinga-1.infra.qop' in zone 'icinga-1.infra.XXX'.

[2021-03-20 13:14:54 +0100] information/JsonRpcConnection: Received certificate request for CN 'icinga-1.infra.XXX' signed by our CA.

[2021-03-20 13:14:54 +0100] information/JsonRpcConnection: The certificate for CN 'icinga-1.infra.XXX' is valid and uptodate. Skipping automated renewal.

[2021-03-20 13:14:54 +0100] information/ApiListener: Finished sending replay log for endpoint 'icinga-1.infra.XXX' in zone 'icinga-1.infra.XXX'.

[2021-03-20 13:14:54 +0100] information/ApiListener: Finished syncing endpoint 'icinga-1.infra.XXX' in zone 'icinga-1.infra.XXX'.

[2021-03-20 13:14:54 +0100] information/JsonRpcConnection: Received certificate request for CN 'production-1.infra.XXX' not signed by our CA: self signed certificate (code 18)

[2021-03-20 13:14:54 +0100] warning/JsonRpcConnection: Ticket '9f8dc806d47a57a15644b045581ef234d6ce7b66' for CN 'production-1.infra.XXX' is invalid.

[2021-03-20 13:14:54 +0100] information/JsonRpcConnection: Received certificate request for CN 'production-1.infra.XXX' not signed by our CA: self signed certificate (code 18)

[2021-03-20 13:14:54 +0100] warning/JsonRpcConnection: Ticket 'c2da64017a52398849c24f4ac016d0171c9d4358' for CN 'production-1.infra.XXX' is invalid.

[2021-03-20 13:14:55 +0100] information/WorkQueue: #6 (ApiListener, RelayQueue) items: 0, rate: 0.0666667/s (4/min 4/5min 4/15min);

[2021-03-20 13:14:55 +0100] information/WorkQueue: #7 (ApiListener, SyncQueue) items: 0, rate: 0/s (0/min 0/5min 0/15min);

[2021-03-20 13:14:55 +0100] information/IdoMysqlConnection: Pending queries: 9 (Input: 3/s; Output: 2/s)

Satellite:

[2021-03-20 13:17:03 +0100] information/ApiListener: Requesting new certificate for this Icinga instance from endpoint 'icinga.XXX'.

[2021-03-20 13:17:03 +0100] information/ApiListener: Sending config updates for endpoint 'icinga.-XXX' in zone 'master'.

[2021-03-20 13:17:03 +0100] information/ApiListener: Finished sending config file updates for endpoint 'icinga.XXX' in zone 'master'.

[2021-03-20 13:17:03 +0100] information/ApiListener: Syncing runtime objects to endpoint 'icinga.XXX'.

[2021-03-20 13:17:03 +0100] information/ApiListener: Finished syncing runtime objects to endpoint 'icinga.XXX'.

[2021-03-20 13:17:03 +0100] information/ApiListener: Finished sending runtime config updates for endpoint 'icinga.XXX' in zone 'master'.

[2021-03-20 13:17:03 +0100] information/ApiListener: Sending replay log for endpoint 'icinga.XXX' in zone 'master'.

[2021-03-20 13:17:03 +0100] information/ApiListener: Finished sending replay log for endpoint 'icinga.XXX' in zone 'master'.

[2021-03-20 13:17:03 +0100] information/ApiListener: Finished syncing endpoint 'icinga.XXX' in zone 'master'.

[2021-03-20 13:17:03 +0100] information/ApiListener: Finished reconnecting to endpoint 'icinga.XXX' via host 'icinga.XXX' and port '5665'

[2021-03-20 13:17:12 +0100] information/WorkQueue: #5 (IdoMysqlConnection, ido-mysql) items: 0, rate: 0.366667/s (22/min 22/5min 22/15min);

[2021-03-20 13:17:12 +0100] information/WorkQueue: #6 (ApiListener, RelayQueue) items: 0, rate: 0/s (0/min 0/5min 0/15min);

[2021-03-20 13:17:12 +0100] information/WorkQueue: #7 (ApiListener, SyncQueue) items: 0, rate: 0/s (0/min 0/5min 0/15min);

[2021-03-20 13:17:20 +0100] information/ApiListener: New client connection for identity 'production-1.infra.XXX' from [192.168.1.103]:47718 (no Endpoint object found for identity)

[2021-03-20 13:17:20 +0100] information/JsonRpcConnection: Received certificate request for CN 'production-1.infra.XXX' signed by our CA.

[2021-03-20 13:17:20 +0100] information/JsonRpcConnection: The certificate for CN 'production-1.infra.qop.cz' is valid and uptodate. Skipping automated renewal.

Agent:

[2021-03-20 13:18:04 +0100] information/ApiListener: Adding new listener on port '5665'

[2021-03-20 13:18:04 +0100] information/ApiListener: Reconnecting to endpoint 'icinga-1.infra.XXX' via host '192.168.1.106' and port '5665'

[2021-03-20 13:18:04 +0100] information/CheckerComponent: 'checker' started.

[2021-03-20 13:18:04 +0100] information/ConfigItem: Activated all objects.

[2021-03-20 13:18:04 +0100] information/ApiListener: New client connection for identity 'icinga-1.infra.XXX' to [192.168.1.106]:5665

[2021-03-20 13:18:04 +0100] information/ApiListener: Finished reconnecting to endpoint 'icinga-1.infra.XXX' via host '192.168.1.106' and port '5665'

[2021-03-20 13:18:04 +0100] information/ApiListener: Requesting new certificate for this Icinga instance from endpoint 'icinga-1.infra.XXX'.

[2021-03-20 13:18:04 +0100] information/ApiListener: Sending config updates for endpoint 'icinga-1.infra.XX' in zone 'master'.

[2021-03-20 13:18:04 +0100] information/ApiListener: Finished sending config file updates for endpoint 'icinga-1.infra.XXX' in zone 'master'.

[2021-03-20 13:18:04 +0100] information/ApiListener: Syncing runtime objects to endpoint 'icinga-1.infra.XXX'.

[2021-03-20 13:18:04 +0100] information/ApiListener: Finished syncing runtime objects to endpoint 'icinga-1.infra.XX'.

[2021-03-20 13:18:04 +0100] information/ApiListener: Finished sending runtime config updates for endpoint 'icinga-1.infra.XXX' in zone 'master'.

[2021-03-20 13:18:04 +0100] information/ApiListener: Sending replay log for endpoint 'icinga-1.infra.XXX' in zone 'master'.

[2021-03-20 13:18:04 +0100] information/ApiListener: Replayed 97 messages.

[2021-03-20 13:18:04 +0100] information/ApiListener: Finished sending replay log for endpoint 'icinga-1.infra.XXX' in zone 'master'.

[2021-03-20 13:18:04 +0100] information/ApiListener: Finished syncing endpoint 'icinga-1.infra.XXX' in zone 'master'.

The services are set to Run on agent.

There is something wrong with the certificate of your agent:

[2021-03-20 13:14:54 +0100] information/JsonRpcConnection: Received certificate request for CN 'production-1.infra.qop.cz' not signed by our CA: self signed certificate (code 18)
[2021-03-20 13:14:54 +0100] warning/JsonRpcConnection: Ticket '9f8dc806d47a57a15644b045581ef234d6ce7b66' for CN 'production-1.infra.qop.cz' is invalid.
[2021-03-20 13:14:54 +0100] information/JsonRpcConnection: Received certificate request for CN 'production-1.infra.qop' not signed by our CA: self signed certificate (code 18)
[2021-03-20 13:14:54 +0100] warning/JsonRpcConnection: Ticket 'c2da64017a52398849c24f4ac016d0171c9d4358' for CN 'production-1.infra.qop' is invalid.

Some minutes later it looks OK, though:

[2021-03-20 13:17:20 +0100] information/JsonRpcConnection: The certificate for CN 'production-1.infra.qop.cz' is valid and uptodate. Skipping automated renewal.

It looks like you didn’t create a host object in the director:

[2021-03-20 13:17:20 +0100] information/ApiListener: New client connection for identity 'production-1.infra.qop.cz' from [192.168.1.103]:47718 (no Endpoint object found for identity)

This looks strange to me:

[2021-03-20 13:18:04 +0100] information/ApiListener: Finished syncing endpoint 'icinga-1.infra.qop' in zone 'master'.

Is icinga-1.infra.qop really an endpoint in your master zone?
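
For comparison, a sketch of how an agent's zones.conf is usually laid out when its parent is a satellite: the satellite endpoint sits in the satellite's own zone rather than in a zone called "master". The zone name "icinga-1.infra" is an assumption and has to match whatever the satellite zone is called on the master and the satellite. The host and port values are taken from the agent configuration posted earlier.

```
// zones.conf on the agent (sketch)
object Endpoint "icinga-1.infra" {
	host = "192.168.1.106"
	port = "5665"
}

object Zone "icinga-1.infra" {
	endpoints = [ "icinga-1.infra" ]
}

object Endpoint NodeName {
}

object Zone ZoneName {
	endpoints = [ NodeName ]
	parent = "icinga-1.infra"	// the agent's parent is the satellite zone
}

object Zone "global-templates" {
	global = true
}

object Zone "director-global" {
	global = true
}
```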

@rsx
It works now! Thank you so much for the help :slight_smile: The problem was that the satellite zone on the master didn’t have a parent :slight_smile: