Set service only on active host

dcz01 · January 14, 2025, 9:47am

Hi,

Is there a way to define services in the Icinga 2 config that really belong to only one host, so that they can be assigned to both hosts but are checked only on the active one of the two in an HA setup like this?

object Zone "satellite" {
  endpoints = [ "icinga2-satellite1.localdomain", "icinga2-satellite2.localdomain" ]
}

Currently:

object Service "cluster" {
    check_command = "cluster"
    check_interval = 5s
    retry_interval = 1s
    
    host_name = "icinga2-satellite1.localdomain"
    host_name = "icinga2-satellite2.localdomain"
}

object Service "Connection Icinga-master - Icinga Zone-1" {
  check_command = "cluster-zone"
  check_interval = 5s
  retry_interval = 1s
  vars.cluster_zone = "icinga-zone1"
  
  host_name = "icinga2-satellite1.localdomain"
  host_name = "icinga2-satellite2.localdomain"
}

Greetings
dcz01

moreamazingnick · January 14, 2025, 10:40am

You can use the Icinga DSL to query a service on another host and reuse its check result and output as the output of the other service.
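
A minimal sketch of that idea, assuming the `dummy` check command from the ITL, whose `dummy_state` and `dummy_text` custom variables may be functions. The virtual host name `icinga2-satellite-ha` and the variables `ha_nodes` and `ha_service` are hypothetical; the node names and the service name "cluster" come from the question. The service reports the best state found on any node and UNKNOWN if no node has a check result yet:

```
// Hypothetical virtual host that only carries the aggregated service.
object Host "icinga2-satellite-ha" {
  check_command = "dummy"
}

object Service "cluster" {
  host_name = "icinga2-satellite-ha"
  check_command = "dummy"
  check_interval = 1m

  // Assumed names, used only inside the functions below.
  vars.ha_nodes = [ "icinga2-satellite1.localdomain", "icinga2-satellite2.localdomain" ]
  vars.ha_service = "cluster"

  // Best (lowest) state of the service across all nodes, 3 = UNKNOWN.
  vars.dummy_state = {{
    var best = 3
    for (node in macro("$ha_nodes$")) {
      var svc = get_service(node, macro("$ha_service$"))
      if (svc && svc.last_check_result && svc.state < best) {
        best = svc.state
      }
    }
    return best
  }}

  // Plugin output of every node, joined into one line.
  vars.dummy_text = {{
    var lines = []
    for (node in macro("$ha_nodes$")) {
      var svc = get_service(node, macro("$ha_service$"))
      if (svc && svc.last_check_result) {
        lines.add(node + ": " + svc.last_check_result.output)
      } else {
        lines.add(node + ": no check result yet")
      }
    }
    return lines.join(", ")
  }}
}
```

The aggregated service only reads the last results of the per-node services, so those still have to exist and be checked on each node; notifications can then be attached to the virtual host's service only.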

rivad · January 14, 2025, 2:49pm

I use a “virtual” Icinga host object with the HA IP.
It has the 116_cluster_nodes variable set and a service with the same name as on the HA nodes, which uses the following DSL code as its check.

object CheckCommand "116-cmd-only-one" {
    import "plugin-check-command"
    command = [ "/usr/lib64/nagios/plugins/dummy" ]
    timeout = 10s
    arguments += {
        "--message" = {
            description = "Message"
            required = true
            value = {{
                var output_status = ""
                var up_count = 0
                var down_count = 0
                var cluster_nodes = macro("$116_cluster_nodes$")
                var only_one_service_name = macro("$116-cluster-only-one-service$")
            
                for (node in cluster_nodes) {
                  if (get_service(node, only_one_service_name).state > 0) {
                    down_count += 1
                  } else {
                    up_count += 1
                  }
                }
            
                if (up_count == 1) {
                  output_status = "OK: "
                } else {
                  output_status = "CRITICAL: "
                }
            
                var output = output_status
            
                for (node in cluster_nodes) {
                  output += node + ": " + only_one_service_name + ": " + get_service(node, only_one_service_name).last_check_result.output + " "
                }
            
                output += " | count_of_alive_" + only_one_service_name +"="+up_count+";1:1;1:1"
                return output
            }}
        }
        "--state" = {
            description = "State"
            value = {{
                var up_count = 0
                var down_count = 0
                var cluster_nodes = macro("$116_cluster_nodes$")
                var only_one_service_name = macro("$116-cluster-only-one-service$")
            
                for (node in cluster_nodes) {
                  if (get_service(node, only_one_service_name).state > 0) {
                    down_count += 1
                  } else {
                    up_count += 1
                  }
                }
            
                if (up_count == 1) {
                  return "ok" // exactly one node reports the service as OK -> OK
                } else {
                  return "crit" // no node or more than one node OK -> CRITICAL
                }
            }}
        }
    }
}
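
For context, the objects that feed this command might look roughly like the following. This is only a sketch: the host name `ha-virtual`, the node names, the service name `my-ha-service` and the address are placeholders, not taken from the post. The bracket syntax is used on the assumption that variable names starting with a digit or containing a hyphen cannot be written as `vars.name`:

```
// Hypothetical virtual host carrying the HA IP.
object Host "ha-virtual" {
  address = "192.0.2.10" // placeholder address from the documentation range
  check_command = "hostalive"

  // Host objects of the real cluster nodes (placeholder names).
  vars["116_cluster_nodes"] = [ "node-a", "node-b" ]
}

// Same service name as on the HA nodes, checked via the DSL command above.
object Service "my-ha-service" {
  host_name = "ha-virtual"
  check_command = "116-cmd-only-one"

  // Name of the per-node service the command looks up with get_service().
  vars["116-cluster-only-one-service"] = "my-ha-service"
}
```

With this layout the command returns OK only while exactly one node reports the service as OK, which matches an active/passive cluster where the service runs on a single node at a time.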

dcz01 · January 15, 2025, 11:46am

Thanks for the good answers.
My only remaining problem was getting the checks to fail over automatically between the masters in the master zone, but that is now working fine after I found the right documentation:
Distributed Monitoring - Icinga 2

Regards
dcz01
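
A rough sketch of the kind of setup that documentation describes, not the poster's actual config: when a zone has two endpoints, checks for objects in that zone are distributed between them, and the remaining endpoint takes over if one goes down. The master hostnames and the file path below are assumptions for illustration:

```
// zones.conf on both masters (placeholder hostnames)
object Endpoint "icinga2-master1.localdomain" {
  host = "icinga2-master1.localdomain"
}

object Endpoint "icinga2-master2.localdomain" {
  host = "icinga2-master2.localdomain"
}

// Two endpoints in one zone form the HA pair.
object Zone "master" {
  endpoints = [ "icinga2-master1.localdomain", "icinga2-master2.localdomain" ]
}

// zones.d/master/cluster.conf on the config master (assumed path):
// objects placed in the zone directory can be checked by either endpoint.
object Host "icinga2-master1.localdomain" {
  check_command = "hostalive"
  address = "icinga2-master1.localdomain"
}

object Service "cluster" {
  host_name = "icinga2-master1.localdomain"
  check_command = "cluster"
  check_interval = 5s
  retry_interval = 1s
}
```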