New checks and possible zone configuration problem

Hi everyone,

I’m relatively new to Icinga and I’m still trying to understand its basics.
Icinga version is:
icinga2 - The Icinga 2 network monitoring daemon (version: r2.12.0-1)

We have two zones: one has an Endpoint with the master role, and the other has an Endpoint with the satellite role.

Here is the zones.conf file on the master:

object Endpoint "master-endpoint" {
}

object Endpoint "satellite-endpoint" {
}

object Zone "master" {
    endpoints = [ "master-endpoint" ]
}

object Zone "satellite" {
    endpoints = [ "satellite-endpoint" ]
    parent = "master"
}

object Zone "global-templates" {
    global = true
}

object Zone "director-global" {
    global = true
}

Here is the zones.conf file on the satellite:

object Endpoint "master-endpoint" {
    host = "master-fqdn"
    port = "5665"
}

object Zone "master" {
    endpoints = [ "master-endpoint" ]
}

object Endpoint "satellite-endpoint" {
}

object Zone "satellite" {
    endpoints = [ "satellite-endpoint" ]
    parent = "master"
}

object Zone "global-templates" {
    global = true
}

object Zone "director-global" {
    global = true
}

I used generic names such as “satellite-endpoint” and “master-fqdn” here.
We followed the “Top down config sync” approach as described in the documentation here:
https://icinga.com/docs/icinga2/latest/doc/06-distributed-monitoring/#top-down-config-sync

On the master, we have three directories in zones.d:

  1. global-templates: here we have defined checks that should work on all hosts, so in our case, the master endpoint and the satellite endpoint.
  2. master: here we have defined checks that should work only on the endpoints of the master zone, so only the master endpoint in our case.
  3. satellite: here we have defined checks that should work only on the endpoints of the satellite zone, so only the satellite endpoint in our case.

All the existing checks work correctly, but the new ones do not. We are trying to create new checks for the satellite endpoint.
We created new scripts and put them in /usr/lib/nagios/plugins.
Since the checks should run only on the satellite endpoint, we put them in the satellite directory of zones.d. We created a new CheckCommand object and a new Service object for each.

The problem is that the new checks stay pending, and the dashboard shows the master endpoint as the “check source”. Shouldn’t it be the satellite endpoint, as it is for every other check (disk, ping) on the satellite?

I think we have a problem in how the zones are configured, but I’m not sure.

I know that the CheckCommand and Service objects are being created, because they show up in the output of “icinga2 object list --type …”. I’ve compared them with, for example, the CheckCommand and Service for the ping check, and I can find no difference.

Has anyone got any idea why it isn’t working?

Thank you very much, and sorry if I made any mistakes in the topic.

Hello @JasKaur!

Please share the non-working configs along with their locations.

Best,
AK

Hi, thank you for your reply.

I will share one of the checks I’m trying to add, with made-up names.
Yesterday I noticed something very weird: the check works when it targets the master, but it doesn’t (meaning it stays pending) when it targets the satellite.

I’ve put the new script in the PluginDir. I’ve created the CheckCommand object in “commands.conf” in the global-templates directory of zones.d like this:

object CheckCommand "checkcommand_name" {
  command = [ PluginDir + "/new_check.sh" ]
}

Then for the master, I’ve created an Apply Service in “services.conf” in the master directory of zones.d like this:

apply Service "check-display-name" {
  import "generic-service"

  check_command = "checkcommand_name"

  assign where host.name == NodeName
}

And this check works: it isn’t pending, the output is correct, the notifications are being triggered.

I’ve tried to do the same thing for the satellite, since the satellite is what I actually want to monitor and the check on the master was just a test, but it doesn’t work: the check stays pending.

For the satellite, I’ve created an Apply Service in “services.conf” in the satellite directory of zones.d like this:

apply Service "check-display-name" {
  import "generic-service"

  check_command = "checkcommand_name"

  assign where host.name == "satellite-fqdn"
}

The check stays pending, but if I force it with “check now” it runs (except for one check, which stays pending anyway). However, the output I see is that of the command executed on the master, not on the satellite. Also, the check runs only once and then goes back to pending until I hit “check now” again, and no notifications are triggered.

Put all your applys in global zones.

You mean “global-templates”?

I tried that just now, and again the check on the master is executed while the check on the satellite stays pending.

Don’t do this. NodeName is a relative host name.
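
A minimal sketch of the difference, reusing the placeholder names from the post above: NodeName is a constant set per Icinga 2 instance (normally in constants.conf), so a rule that compares against it resolves to a different host depending on which node evaluates it. An explicit name does not have that problem.

```
apply Service "check-display-name" {
  import "generic-service"

  check_command = "checkcommand_name"

  // Relative: matches the host named like the instance that evaluates
  // this rule, so the result differs between master and satellite.
  // assign where host.name == NodeName

  // Explicit: matches the same host on every instance.
  // "satellite-fqdn" is the placeholder used in this thread.
  assign where host.name == "satellite-fqdn"
}
```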

Yes, you are right.

Anyway, the checks using NodeName work, since they are defined on the master and run against the master.

If I put

assign where host.name == "satellite-fqdn"

the checks don’t work.

Even if I put

assign where host.address

the checks work on the master but not on the satellite, where they just stay pending.

Please run icinga2 object list -t service --name 'YOUR_HOST!YOUR_SERVICE' on the satellite and share the output.

On the satellite there is no output.
On the master, objects are being created.

Can you tell me why that is?

Maybe I can if you…

On the satellite there is no output at all.
If you want, this is the output on the master:

Object 'satellite-host!service' of type 'Service':
  % declared in '/etc/icinga2/zones.d/global-templates/services.conf', lines 61:1-61:43
  * __name = "satellite-host!service"
  * action_url = ""
  * check_command = "check_name"
    % = modified in '/etc/icinga2/zones.d/global-templates/services.conf', lines 63:4-63:30
  * check_interval = 60
    % = modified in '/etc/icinga2/conf.d//templates.conf', lines 33:3-33:21
    % = modified in '/etc/icinga2/zones.d/global-templates/services.conf', lines 69:4-69:22
  * check_period = ""
  * check_timeout = null
  * command_endpoint = ""
  * display_name = "service"
  * enable_active_checks = true
  * enable_event_handler = true
  * enable_flapping = false
  * enable_notifications = true
  * enable_passive_checks = true
  * enable_perfdata = true
  * event_command = ""
  * flapping_threshold = 0
  * flapping_threshold_high = 30
  * flapping_threshold_low = 25
  * groups = [ ]
  * host_name = "satellite-fqdn"
    % = modified in '/etc/icinga2/zones.d/global-templates/services.conf', lines 61:1-61:43
  * icon_image = ""
  * icon_image_alt = ""
  * max_check_attempts = 5
    % = modified in '/etc/icinga2/conf.d//templates.conf', lines 32:3-32:24
  * name = "check"
    % = modified in '/etc/icinga2/zones.d/global-templates/services.conf', lines 61:1-61:43
  * notes = ""
  * notes_url = ""
  * package = "_etc"
    % = modified in '/etc/icinga2/zones.d/global-templates/services.conf', lines 61:1-61:43
  * retry_interval = 30
    % = modified in '/etc/icinga2/conf.d//templates.conf', lines 34:3-34:22
  * source_location
    * first_column = 1
    * first_line = 61
    * last_column = 43
    * last_line = 61
    * path = "/etc/icinga2/zones.d/global-templates/services.conf"
  * templates = [ "check", "generic-service" ]
    % = modified in '/etc/icinga2/zones.d/global-templates/services.conf', lines 61:1-61:43
    % = modified in '/etc/icinga2/conf.d//templates.conf', lines 31:1-31:34
  * type = "Service"
  * vars
    * checkString = "check_string"
      % = modified in '/etc/icinga2/zones.d/global-templates/services.conf', lines 67:4-67:35
    * logName = "log_name"
      % = modified in '/etc/icinga2/zones.d/global-templates/services.conf', lines 66:4-66:33
    * notification
      * mail
        % = modified in '/etc/icinga2/conf.d//templates.conf', lines 35:3-38:3
        * groups = [ "icingaadmins" ]
        * users = [ "usr1" ]
  * volatile = false
  * zone = "satellite-zone"
    % = modified in '/etc/icinga2/zones.d/global-templates/services.conf', lines 61:1-61:43

I used generic names for the hosts, services, checks and check arguments.

And icinga2 object list -t host --name satellite-host on master?

Here you are:

Object 'satellite-name' of type 'Host':
  % declared in '/etc/icinga2/zones.d/satellite/hosts.conf', lines 1:0-1:25
  * __name = "satellite-name"
  * action_url = ""
  * address = "satellite-ip"
    % = modified in '/etc/icinga2/zones.d/satellite/hosts.conf', lines 7:2-7:26
  * address6 = ""
  * check_command = "hostalive"
    % = modified in '/etc/icinga2/conf.d//templates.conf', lines 24:3-24:29
  * check_interval = 60
    % = modified in '/etc/icinga2/conf.d//templates.conf', lines 18:3-18:21
  * check_period = ""
  * check_timeout = null
  * command_endpoint = ""
  * display_name = "satellite-name"
    % = modified in '/etc/icinga2/zones.d/satellite/hosts.conf', lines 6:5-6:33
  * enable_active_checks = true
  * enable_event_handler = true
  * enable_flapping = false
  * enable_notifications = true
  * enable_passive_checks = true
  * enable_perfdata = true
  * event_command = ""
  * flapping_threshold = 0
  * flapping_threshold_high = 30
  * flapping_threshold_low = 25
  * groups = [ ]
  * icon_image = ""
  * icon_image_alt = ""
  * max_check_attempts = 3
    % = modified in '/etc/icinga2/conf.d//templates.conf', lines 17:3-17:24
  * name = "satellite-name"
  * notes = ""
  * notes_url = ""
  * package = "_etc"
  * retry_interval = 30
    % = modified in '/etc/icinga2/conf.d//templates.conf', lines 19:3-19:22
  * source_location
    * first_column = 0
    * first_line = 1
    * last_column = 25
    * last_line = 1
    * path = "/etc/icinga2/zones.d/satellite/hosts.conf"
  * templates = [ "satellite-name", "generic-host" ]
    % = modified in '/etc/icinga2/zones.d/satellite/hosts.conf', lines 1:0-1:25
    % = modified in '/etc/icinga2/conf.d//templates.conf', lines 16:1-16:28
  * type = "Host"
  * vars
    * geolocation = "45.394499, 11.021408"
      % = modified in '/etc/icinga2/zones.d/satellite/hosts.conf', lines 14:3-14:43
    * notification
      * mail
        % = modified in '/etc/icinga2/conf.d//templates.conf', lines 20:3-23:3
        % = modified in '/etc/icinga2/zones.d/satellite/hosts.conf', lines 8:3-11:3
        * groups = [ "icingaadmins" ]
        * users = [ "usr1" ]
  * volatile = false
  * zone = "satellite"

Is this output more or less equal on both master and satellite?

Yes, the differences are:

  1. the IP address, obviously, because it’s the satellite itself
  2. path: it’s “/etc/icinga2/conf.d/hosts.conf”
  3. zone: it’s " "

Put it under zones.d/…

How?

On the satellite I have only a README in the zones.d directory, and from what I’ve understood that is how it should be.
The path it is referring to (“/etc/icinga2/conf.d/hosts.conf”) is the definition of the NodeName host on the satellite.

At least remove it from conf.d.

Excuse me, I don’t understand what you are referring to.
Remove what from conf.d? If I remove the definition of the NodeName host (generated automatically, not written by me), icinga2 object list --type Host on the satellite returns nothing.

The only thing I’ve modified on the satellite is the “zones.conf” file in conf.d and I’ve shown the file in my question.

Sorry if this is taking a lot of time, but I still can’t see a solution.

Please share your /etc/icinga2/icinga2.conf
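
Two things are worth looking at on the satellite, sketched here under the assumption of a stock package layout (the file paths and the state of these lines are assumptions, not taken from the poster’s system): whether conf.d is still included, and whether the API listener accepts synced config at all.

```
// /etc/icinga2/icinga2.conf on the satellite (excerpt)
// While this include is active, the local conf.d/hosts.conf host object
// is loaded outside of any zone, next to whatever the master syncs.
// Commenting the line out stops conf.d from being loaded.
include_recursive "conf.d"

// /etc/icinga2/features-enabled/api.conf on the satellite
// Without accept_config the satellite does not take over the zones.d
// content that the master tries to sync to it.
object ApiListener "api" {
  accept_config = true
  accept_commands = true
}
```

If the config sync works, the synced files should appear under /var/lib/icinga2/api/zones/ on the satellite, not in /etc/icinga2/zones.d, which is why that directory holding only a README is expected. Anything still needed from conf.d (the output above shows generic-service coming from conf.d/templates.conf) would have to move to a global zone before the include is disabled.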