I'm trying to resolve a strange issue…
All my RHEL7 clients work flawlessly!
All my RHEL6 clients work for a few minutes, then go to an UNKNOWN state and remain that way.
I have some custom service checks that are listed in my commands.conf (see below). The commands.conf is located on my master server in /etc/icinga2/zones.d/global-templates/commands.conf
All clients have the api.conf set to:
object ApiListener "api" {
  accept_commands = true
  accept_config = true
}
I have also verified the commands.conf is being sent to the clients
[user@nagiostst ]$ sudo grep -E 'disk_usage|mem_used' /var/lib/icinga2/api/zones/global-templates/_etc/commands.conf
object CheckCommand "mem_used" {
object CheckCommand "disk_usage" {
Example:
RHEL6 service checks
RHEL7 service checks
These are the custom commands in commands.conf
object CheckCommand "mem_used" {
  import "plugin-check-command"
  command = [PluginDir + "/check_memory.py"]
  arguments = {
    "-W" = "$mem_warn$"
    "-C" = "$mem_crit$"
  }
}
object CheckCommand "disk_usage" {
  import "plugin-check-command"
  command = [PluginDir + "/check_disk_usage.py"]
  arguments = {
    "-D" = "$disk$"
    "-W" = "$warn$"
    "-C" = "$crit$"
  }
}
The plugins are present in the Nagios plugins directory on each client.
[user@nagiostst]$ ls -la /usr/lib64/nagios/plugins/*.py
-rwxr-xr-x. 1 root root 1854 Oct 9 00:06 /usr/lib64/nagios/plugins/check_disk_usage.py
-rwxr-xr-x. 1 root root 3476 Oct 9 00:06 /usr/lib64/nagios/plugins/check_memory.py
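The same presence check can be scripted so it is easy to repeat on every client. This is only a sketch: `check_plugins` is a hypothetical helper name, and the directory and file names in the example call are the ones shown in the listing above. Note that `-x` tests executability for the user running the script, which may differ from the user the Icinga daemon runs as.

```shell
#!/bin/sh
# Hypothetical helper: report whether each named plugin exists in DIR and is
# executable by the current user. Returns non-zero if any plugin fails the test.
# Usage: check_plugins DIR NAME...
check_plugins() {
    dir=$1
    shift
    rc=0
    for name in "$@"; do
        if [ -x "$dir/$name" ]; then
            echo "ok: $dir/$name"
        else
            echo "missing or not executable: $dir/$name"
            rc=1
        fi
    done
    return $rc
}

# Example call with the directory and plugin names from this thread:
# check_plugins /usr/lib64/nagios/plugins check_memory.py check_disk_usage.py
```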
This is the client's config file on the master:
object Endpoint "nagiostst" {
  host = "1.2.3.4"
}
object Zone "nagiostst" {
  endpoints = [ "nagiostst" ]
  parent = "master"
}
object Host "nagiostst" {
  import "icon_rhel_vhost"
  import "unix"
  address = "1.2.3.4"
  vars.os = "linux"
  vars.disks["disk"] = { /* No parameters. */ }
  vars.filesystem["/"] = {}
  vars.filesystem["/export"] = {}
  vars.filesystem["/var"] = {}
  vars.filesystem["/var/log"] = {}
  vars.client_endpoint = name
  vars.program = "unixadm"
  vars.location = "xxxxx"
  vars.type = "virtual"
}
dnsmichi
(Michael Friedrich)
October 23, 2019, 7:02am
2
Verify that the daemon actually reads the commands, e.g. with icinga2 daemon -C -x notice | grep commands.conf. Also, please share the service apply rules in place here.
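For anyone who wants to script that check, here is a minimal sketch. It assumes the validation output contains a "Compiling config file:" line for every file the daemon loads; `config_was_compiled` is a hypothetical helper name, not part of Icinga.

```shell
#!/bin/sh
# Hypothetical helper: read `icinga2 daemon -C -x notice` output on stdin and
# succeed only if some "Compiling config file:" line mentions the given name.
# Usage: ... | config_was_compiled FILENAME
config_was_compiled() {
    awk -v f="$1" '
        index($0, "Compiling config file:") && index($0, f) { found = 1 }
        END { exit !found }
    '
}

# Example (run on the agent):
# sudo icinga2 daemon -C -x notice 2>&1 | config_was_compiled commands.conf \
#     && echo "commands.conf is loaded" \
#     || echo "commands.conf is NOT loaded"
```

An empty result from the plain grep and a non-zero exit status from this helper mean the same thing: the daemon did not compile that file.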
Which Icinga versions are involved?
Cheers,
Michael
Appreciate the quick response Michael!
Here’s the output:
Client - RHEL6 :
[user@nagiostst ]$ sudo icinga2 daemon -C -x notice | grep commands.conf
Empty return
[user@nagiostst ]$ sudo icinga2 -V | head -1
icinga2 - The Icinga 2 network monitoring daemon (version: 2.11.0-1 )
Master :
[user@icinga01 ]$ sudo icinga2 daemon -C -x notice | grep commands.conf
[2019-10-23 09:36:20 -0600] notice/ConfigCompiler: Compiling config file: /etc/icinga2/zones.d/global-templates/commands.conf
[user@icinga01 ~ 14272]$ sudo icinga2 -V | head -1
icinga2 - The Icinga 2 network monitoring daemon (version: r2.10.5-1 )
And a RHEL7 Client:
[user@unixadm ]$ sudo icinga2 daemon -C -x notice | grep commands.conf
[2019-10-23 09:40:36 -0600] notice/ConfigCompiler: Compiling config file: /var/lib/icinga2/api/zones/global-templates/_etc/commands.conf
[user@unixadm ]$ sudo icinga2 -V | head -1
icinga2 - The Icinga 2 network monitoring daemon (version: r2.10.5-1 )
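Since the three hosts report different versions (2.11.0 on the RHEL6 client, 2.10.5 on the master and the RHEL7 client), it can help to extract just the version string for a side-by-side comparison. A rough sketch, assuming the first line of `icinga2 -V` always has the `(version: …)` form shown above; `icinga_version` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: extract the bare version from the first line of
# `icinga2 -V`, dropping the optional leading "r" seen in the 2.10.5 output.
# Prints nothing if the line does not contain a "(version: ...)" part.
icinga_version() {
    sed -n -E 's/.*\(version: *r?([0-9][^ )]*) *\).*/\1/p'
}

# Example:
# sudo icinga2 -V | head -1 | icinga_version
```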
Here are the services:
NOTE: the services.conf is located in /etc/icinga2/zones.d/master/services.conf
apply Service "Disk Usage: " for (filesystem_name => config in host.vars.filesystem) {
  import "service-check-alarm-settings"
  check_command = "disk_usage"
  command_endpoint = host.vars.client_endpoint
  vars.grafana_graph_disable = true
  vars.disk = filesystem_name
  if (!vars.warn) { vars.warn = "80" }
  if (!vars.crit) { vars.crit = "90" }
  vars += config
  assign where host.address && host.vars.os == "linux"
}
apply Service "Memory Used" {
  import "service-check-alarm-settings"
  check_command = "mem_used"
  command_endpoint = host.vars.client_endpoint
  vars.mem_warn = "80"
  vars.mem_crit = "90"
  enable_notifications = false
  assign where host.address && host.vars.os == "linux"
}
dnsmichi
(Michael Friedrich)
October 23, 2019, 4:49pm
4
You might be seeing a bug that was fixed in 2.11.1. Try upgrading the agent, purge everything under /var/lib/icinga2/api/{zones,zones-stage}/*, then restart.
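The purge step could be wrapped in a small function so the two directories are always emptied together. This is a sketch under assumptions: `purge_synced_zones` is a hypothetical helper name, and the default base directory is the path mentioned above. It removes the contents but keeps the `zones` and `zones-stage` directories themselves.

```shell
#!/bin/sh
# Hypothetical helper: empty the synced-config directories below an API base
# directory while keeping the directories themselves.
# Usage: purge_synced_zones [BASE_DIR]
purge_synced_zones() {
    base=${1:-/var/lib/icinga2/api}
    for sub in zones zones-stage; do
        if [ -d "$base/$sub" ]; then
            # -mindepth 1 spares the directory itself; -delete removes
            # children before their parent directories.
            find "$base/$sub" -mindepth 1 -delete
        fi
    done
}

# Example (as root on the agent), followed by a restart of icinga2:
# purge_synced_zones /var/lib/icinga2/api
```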
Gave that a try…
Looks much better! HUGE THANKS!
[user@nagiostst ]$ sudo yum -y update
[user@nagiostst ]$ sudo rm -rf /var/lib/icinga2/api/zones/*
[user@nagiostst ]$ sudo rm -rf /var/lib/icinga2/api/zones-stage/*
[user@nagiostst ]$ sudo ls /var/lib/icinga2/api/zones
[user@nagiostst ]$ sudo reboot
[user@nagiostst ]$ sudo icinga2 -V | head -1
icinga2 - The Icinga 2 network monitoring daemon (version: 2.11.1-1)
[user@nagiostst ]$ sudo icinga2 daemon -C -x notice | grep commands.conf
[2019-10-23 15:38:40 -0600] notice/ConfigCompiler: Compiling config file: /var/lib/icinga2/api/zones/global-templates/_etc/commands.conf
dnsmichi
(Michael Friedrich)
October 24, 2019, 7:22am
6
Cool, thanks for confirming. I’ve added these details to the upgrading docs to allow everyone to follow along.
Icinga:master
← Icinga:feature/upgrading-docs-bugfixes-2-11
opened 07:18AM - 24 Oct 19 UTC