One of many services is in Pending state

Hi all,

I have a setup with a master, some satellites, and agents on about 30 machines. The setup is working fine: all agents talk to their satellites and all satellites talk to the master. Service checks work and are executed on the agents. Everything looks good. But now I have found that one check (actually two: Ping4 and Ping6) is in pending state on new hosts. The host itself is green (hostalive) and all other checks work normally (disk usage, CPU usage, etc.). Those other checks, however, run on the agent itself (command_endpoint = host.name).

After some experiments with other check_commands in the service, all hosts are now in pending state for the Ping4/Ping6 services, and the services stay pending forever. I changed the check_command to hostalive, but that didn’t help. The check is part of the default configuration in conf.d/services.conf.

apply Service "Ping4" {
  import "generic-service"
  check_command = "ping4"
  assign where host.address
}

All hosts are configured as follows in zones.d/host1/hosts.conf:

object Host "host1" {
  import "generic-host"
  check_command = "hostalive"
  address = "<ipv4>"
  address6 = "<ipv6>"
  ...

This is totally strange. Of course I can ping all hosts, so the check itself should work. I also tried “Check now” and a forced reschedule in Icinga Web, but nothing happens. There are no errors or related messages anywhere in /var/log/icinga2/icinga2.log. A reload doesn’t change the state either. It looks like the service check is never executed, but there is no message explaining why.

  • Version used (icinga2 --version)

on master r2.12.5-1
on agents r2.13.1-1

(I know this isn’t good, but the master runs inside a Docker container and the image has not been updated yet.) The service checks have been pending since March, though, and at that time the Icinga versions were the same. Of course the ping command works inside the container too.

  • Operating System and version

Debian-10/11

  • Enabled features (icinga2 feature list)

Enabled features: api checker command compatlog graphite ido-mysql livestatus mainlog notification

  • Icinga Web 2 version and modules (System - About)

2.9.0

  • Config validation (icinga2 daemon -C)
[2021-10-08 10:50:54 +0200] information/cli: Icinga application loader (version: r2.12.5-1)
[2021-10-08 10:50:54 +0200] information/cli: Loading configuration file(s).
[2021-10-08 10:50:54 +0200] information/ConfigItem: Committing config item(s).
[2021-10-08 10:50:54 +0200] information/ApiListener: My API identity: xxxxx
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 1 NotificationComponent.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 29 Hosts.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 2 Downtimes.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 1 GraphiteWriter.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 6 NotificationCommands.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 1 FileLogger.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 94 Comments.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 1948 Notifications.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 1 IcingaApplication.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 7 HostGroups.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 1 CheckerComponent.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 26 Zones.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 25 Endpoints.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 1 ExternalCommandListener.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 1 IdoMysqlConnection.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 2 ApiUsers.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 1 CompatLogger.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 1 ApiListener.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 250 CheckCommands.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 1 LivestatusListener.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 3 TimePeriods.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 3 UserGroups.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 3 Users.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 679 Services.
[2021-10-08 10:50:55 +0200] information/ConfigItem: Instantiated 41 ServiceGroups.
[2021-10-08 10:50:55 +0200] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2021-10-08 10:50:55 +0200] information/cli: Finished validating the configuration file(s).
  • If you run multiple Icinga 2 instances, the zones.conf file (or icinga2 object list --type Endpoint and icinga2 object list --type Zone) from all affected nodes

This is a long list, so here is an excerpt:

object Zone "global-templates" {
  global = true
}

object Endpoint "xxxxx" {
}

object Zone "master" {
  endpoints = [ "xxxxx" ]
}

object Endpoint "host1" {
    host = "host1"
    port = "5665"
}

object Zone "host1" {
    parent = "master"
    endpoints = [ "host1" ]
}

When I define a service like this:

apply Service "Ping IPv4" {
  import "generic-service"

  check_command = "ping4"
  command_endpoint = host.name

  assign where host.address && host.name == "host1"
}

then the check runs. It looks like the service definition without command_endpoint is never executed (although it should run on the master). The problem may be that the zone master contains the endpoint “xxxxx”, but the Endpoint object for xxxxx doesn’t have a host attribute. Am I wrong?

You’re using an unsupported combination of versions; details can be found here.

Did you read my comment on that? It may not be “supported”, but it can’t be the cause, because the check should run on the master ONLY. The agent having a different version is not relevant here.

The Docker image has been updated and all Icinga instances are now on version r2.13.1-1.

As written, the issue is still there. It affects the check that is meant to run on the master, for all hosts that are part of a zone. I have other hosts (without an agent) where the ping check works. These hosts are not defined in zones.d but in conf.d.
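
The difference between hosts defined in conf.d (ping works) and hosts defined under a per-agent zone directory (ping stays pending) is consistent with how Icinga 2 assigns objects to zones. Objects placed in a zone's directory become members of that zone, and a check without command_endpoint is scheduled by the endpoints of the object's own zone rather than by the master. If that is what is happening here (an assumption; nothing in this thread confirms it), one possible layout follows the top-down command endpoint pattern from the Icinga 2 distributed monitoring documentation: keep the agent hosts in the master zone and send only selected checks to the agent. The following is a minimal sketch. The file path, the custom variable vars.agent_endpoint, and the "disk" service are illustrative and not taken from the original configuration.

```
// Assumed location: zones.d/master/hosts.conf (host object lives in the master zone)
object Host "host1" {
  import "generic-host"
  check_command = "hostalive"
  address = "<ipv4>"
  address6 = "<ipv6>"

  // Hypothetical custom variable; relies on host name == endpoint name
  vars.agent_endpoint = name
}

// No command_endpoint: scheduled and executed by the zone that owns the host,
// which is the master zone in this layout.
apply Service "Ping4" {
  import "generic-service"
  check_command = "ping4"
  assign where host.address
}

// Illustrative agent-side check: scheduled in the master zone, executed on the agent.
apply Service "disk" {
  import "generic-service"
  check_command = "disk"
  command_endpoint = host.vars.agent_endpoint
  assign where host.vars.agent_endpoint
}
```

With this layout the Endpoint and Zone objects for host1 stay in zones.conf as before; only the location of the Host object changes. Whether this resolves the pending state would still need to be verified on the affected setup.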