Hi,
I’ve been struggling to figure out why the host template field (array) I defined only works intermittently (“on/off”) over the course of the day. It almost seems as if there is latency in resolving the value for “nn_mounts”, which is used in a local check. I’m not sure what to provide here, so please let me know if I am missing something. Am I defining things in the wrong place?
Director data fields entry
Host template
zones.d/director-global/host_templates.conf
template Host "hadoop-vars" {
    vars.nn_mounts = [ "/hadoop" ]
}
Inherited by:
zones.d/director-global/host_templates.conf
template Host "Standard Linux Server" {
    import "host-vars"
    import "hadoop-vars"

    check_command = "ping4"
}
Command: check_disk: https://github.com/nagios-plugins/nagios-plugins
zones.d/director-global/commands.conf
object CheckCommand "check-disk-filedrop" {
    import "plugin-check-command"
    command = [ PluginDir + "/check_disk" ]
    timeout = 1m
    arguments += {
        "-A" = {
            description = "Explicitly select all paths. This is equivalent to -R .*"
            order = 1
            set_if = "$disk_all$"
        }
        "-C" = {
            description = "Clear thresholds"
            set_if = "$disk_clear$"
        }
        "-E" = {
            description = "For paths or partitions specified with -p, only check for exact paths"
            set_if = "$disk_exact_match$"
        }
        "-I" = {
            description = "Regular expression to ignore selected path/partition (case insensitive) (may be repeated)"
            order = 2
            repeat_key = true
            value = "$disk_ignore_eregi_path$"
        }
        "-K" = {
            description = "Exit with CRITICAL status if less than PERCENT of inode space is free"
            order = -3
            value = "$disk_inode_cfree$"
        }
        "-L" = {
            description = "Only check local filesystems against thresholds. Yet call stat on remote filesystems to test if they are accessible (e.g. to detect Stale NFS Handles)"
            set_if = "$disk_stat_remote_fs$"
        }
        "-M" = {
            description = "Display the mountpoint instead of the partition"
            set_if = "$disk_mountpoint$"
        }
        "-R" = {
            description = "Case insensitive regular expression for path/partition (may be repeated)"
            repeat_key = true
            value = "$disk_eregi_path$"
        }
        "-W" = {
            description = "Exit with WARNING status if less than PERCENT of inode space is free"
            order = -3
            value = "$disk_inode_wfree$"
        }
        "-X" = {
            description = "Ignore all filesystems of indicated type (may be repeated)"
            repeat_key = true
            value = "$disk_exclude_type$"
        }
        "-c" = {
            description = "Exit with CRITICAL status if less than INTEGER units of disk are free or Exit with CRITICAL status if less than PERCENT of disk space is free"
            order = -3
            required = true
            value = "15%"
        }
        "-e" = {
            description = "Display only devices/mountpoints with errors"
            set_if = "$disk_errors_only$"
        }
        "-f" = {
            description = "Don't account root-reserved blocks into freespace in perfdata"
            set_if = "$disk_ignore_reserved$"
        }
        "-g" = {
            description = "Group paths. Thresholds apply to (free-)space of all partitions together"
            value = "$disk_group$"
        }
        "-i" = {
            description = "Regular expression to ignore selected path or partition (may be repeated)"
            order = 2
            repeat_key = true
            value = "$disk_ignore_ereg_path$"
        }
        "-k" = {
            description = "Same as --units kB"
            set_if = "$disk_kilobytes$"
        }
        "-l" = {
            description = "Only check local filesystems"
            set_if = "$disk_local$"
        }
        "-m" = {
            description = "Same as --units MB"
            set_if = "$disk_megabytes$"
        }
        "-p" = {
            description = "Path or partition (may be repeated)"
            order = 1
            repeat_key = true
            value = "$disk_partitions$"
        }
        "-p_old" = {
            order = 1
            value = "$disk_partition$"
        }
        "-r" = {
            description = "Regular expression for path or partition (may be repeated)"
            repeat_key = true
            value = "$disk_ereg_path$"
        }
        "-t" = {
            description = "Seconds before connection times out (default: 10)"
            value = "$disk_timeout$"
        }
        "-u" = {
            description = "Choose bytes, kB, MB, GB, TB (default: MB)"
            value = "$disk_units$"
        }
        "-w" = {
            description = "Exit with WARNING status if less than INTEGER units of disk are free or Exit with WARNING status if less than PERCENT of disk space is free"
            order = -3
            required = true
            value = "25%"
        }
        "-x" = {
            description = "Ignore device (only works if -p unspecified)"
            value = "$disk_partitions_excluded$"
        }
        "-x_old" = "$disk_partition_excluded$"
    }
    vars.disk_cfree = "10%"
    vars.disk_exclude_type = [
        "none",
        "tmpfs",
        "sysfs",
        "proc",
        "configfs",
        "devtmpfs",
        "devfs",
        "mtmfs",
        "tracefs",
        "cgroup",
        "fuse.gvfsd-fuse",
        "fuse.gvfs-fuse-daemon",
        "fdescfs",
        "overlay",
        "nsfs",
        "squashfs"
    ]
    vars.disk_megabytes = true
    vars.disk_wfree = "20%"
}
I tried reordering the field priority and making ‘-p’ always required, but alas, no luck.
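For context, the error in the event history below names the macro 'nn_mounts' in argument '-p', so the argument on the failing checks presumably looks something like the sketch here. This is an assumption reconstructed from the error text, not a copy of the deployed config; the command listing above still shows "$disk_partitions$" for '-p'.

```
// Hypothetical fragment of the CheckCommand's arguments dictionary,
// inferred from the error message only.
arguments += {
    "-p" = {
        description = "Path or partition (may be repeated)"
        order = 1
        repeat_key = true
        // With required = true, an unresolved macro yields the
        // "Non-optional macro ... is missing" UNKNOWN result.
        required = true
        // Assumed to resolve to the host custom variable vars.nn_mounts
        // set in the "hadoop-vars" template.
        value = "$nn_mounts$"
    }
}
```

If that is the case, every check execution depends on the node running the check being able to resolve vars.nn_mounts for this host.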
Event history snippet:
Sunday, October 27, 2019
14:00:00  DOWNTIME START  [icingaadmin] Scheduled downtime for Sunday maintenance
Monday, October 21, 2019
15:14:01  OK       DISK OK - free space: /hadoop 226187 MB (44.19% inode=100%);
15:13:02  UNKNOWN  [ 1/7 ] Error: Non-optional macro 'nn_mounts' used in argument '-p' is missing.
15:09:03  OK       DISK OK - free space: /hadoop 226187 MB (44.19% inode=100%);
15:08:04  UNKNOWN  [ 1/7 ] Error: Non-optional macro 'nn_mounts' used in argument '-p' is missing.
15:04:04  OK       DISK OK - free space: /hadoop 226187 MB (44.19% inode=100%);
15:03:05  UNKNOWN  [ 1/7 ] Error: Non-optional macro 'nn_mounts' used in argument '-p' is missing.
14:59:06  OK       DISK OK - free space: /hadoop 226187 MB (44.19% inode=100%);