Check command does not exist, why?

Woefdram · January 31, 2023, 11:34am

I’ve read several articles about this problem, but I can’t seem to solve it on my side.

I use Icinga2 Director, and the T-Systems Ansible playbooks to configure my hosts, commands, templates and agents. But every time I get back to the situation in which Icinga Web tells me that “check command X does not exist”.

The commands are actually there, permissions are OK, and when I run them manually as the nagios user (running Debian Bullseye here), they do exactly what they’re supposed to do. The reason for this tenacious error must therefore be somewhere in the configuration of Icinga or the client.

Icinga comes with a load of “external commands”, and strangely enough those do work like a charm. It’s the normal plugin commands that aren’t found. I can, for example, use the plugin “disk”, which shows me the usage of the different volumes I have configured on a host. But if I use “check_load”, a plugin that is available on the host, and to which I point with an absolute path, I get the aforementioned error.

I don’t have a lot of experience with Icinga yet, and I’m sure it must be something small, but I’ve been struggling with this for days now and am getting rather frustrated. Colleagues are already preparing an old-fashioned Nagios server to do the monitoring of this collection of machines, because “that works”, but I would hate to drop Icinga for Nagios.

Any hints? Where do I look for errors, are zones somehow involved? I’m using 1 master and a bunch of clients, so I don’t think I should bother too much with zones, but please correct me if I’m wrong here.

rivad · January 31, 2023, 2:11pm

When I get this error, it is usually because I want to run a check that only exists on the master but forgot to tell Icinga2 where to run it, and by default my Director setup tries to run it on the client.

Woefdram · January 31, 2023, 4:10pm

I’m trying to run these checks on the client (I need to monitor several hundreds of machines). The checks I try to run are actually there. They are all installed as part of the default installation of these VMs: every VM gets a set of standard Nagios plugins, plus a set of custom made ones.

They’re all there, permissions are OK and if run manually, they give the correct output. I’ve defined my commands with the absolute path to those plugins, so strictly speaking, “does not exist” is not the correct error message.

rivad · January 31, 2023, 5:05pm

Can you check the check source:

Is it really the client?
Can the icinga/nagios user execute the check on the client?
Anything in the logs?

log1c · February 1, 2023, 8:12am

To run checks on the client you need to set the command_endpoint option in the service definition to the endpoint object name of that client. This usually works with command_endpoint = host_name

This will then be reflected in the Check Source on the web interface, like @rivad posted.
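
To make that concrete, here is a minimal sketch in plain Icinga 2 config of a custom command plus a service that runs on the agent. The names `custom_load`, `custom_load_warn`, `custom_load_crit` and `host.vars.agent` are made up for illustration, the plugin path assumes Debian's default plugin directory, and it assumes the agent's Endpoint object has the same name as the host. With Director you would set the equivalent options in the service template instead of writing this by hand.

```
// Hypothetical command wrapping a plugin that lives on the agent
object CheckCommand "custom_load" {
  // Absolute path on the agent; Debian's default plugin directory is assumed
  command = [ "/usr/lib/nagios/plugins/check_load" ]
  arguments = {
    "-w" = "$custom_load_warn$"
    "-c" = "$custom_load_crit$"
  }
}

apply Service "load" {
  check_command = "custom_load"
  // Execute on the agent; assumes the Endpoint object is named like the host
  command_endpoint = host.name
  // Example thresholds only (1, 5 and 15 minute load averages)
  vars.custom_load_warn = "5,4,3"
  vars.custom_load_crit = "10,6,4"
  // Hypothetical custom variable marking hosts that run an agent
  assign where host.vars.agent == true
}
```

Note that with `command_endpoint` set, the agent itself has to know the `custom_load` CheckCommand object, not only the plugin binary. If it does not, you get the “does not exist” message even though the file is on disk.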

rsx · February 1, 2023, 9:05am

…means the corresponding check command is not defined on the machine where the check should run. Why it is missing depends on how you arrange your object definitions. If you use Director to define them, the zone selected in Cluster Zone needs to be defined on your agents. If Cluster Zone is empty, director-global is used.
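
As a rough sketch of what that looks like on the agent side, assuming the default Debian file locations and that the commands are deployed to director-global: the agent needs the global zone declared, and its API listener has to accept config and commands from the master. The file paths in the comments are assumptions about a standard package install.

```
// /etc/icinga2/zones.conf on the agent (path assumed)
// Global zone into which Director deploys command definitions
object Zone "director-global" {
  global = true
}

// /etc/icinga2/features-enabled/api.conf on the agent (path assumed)
object ApiListener "api" {
  // Allow the parent zone to sync config (including CheckCommand objects)
  accept_config = true
  // Allow the parent zone to execute checks via command_endpoint
  accept_commands = true
}
```

If the zone is missing on the agent, the synced CheckCommand objects never arrive there, which would match the symptom described in the first post.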