I have a check command, check_mountpoints, that can take longer than usual because of some slow NFS mounts, so I’m trying to raise its timeout above the 1-minute default.
I set check_timeout = 2m in the apply Service definition and timeout = 2m in the CheckCommand object definition, but neither seems to be applied: the command is still sometimes killed after 60 seconds (<Timeout exceeded.><Terminated by signal 9 (Killed).>).
Please note that this is a remote check executed on a client through the Icinga Agent (command_endpoint = host.vars.client_endpoint in the apply Service definition).
Could you help me understand what I’m doing wrong, please?
object CheckCommand "_check_mountpoints" {
  command = [ CustomPluginsDir + "/check_mountpoints.sh" ]
  arguments = {
    "--mountpoint" = {
      value = "$check_mountpoints_mountpoint$"
      description = "list of mountpoints to check. Ignored when -a is given"
      skip_key = true
      order = -1
    }
    "-m" = {
      value = "$check_mountpoints_mtab$"
      description = "Use this mtab instead (default: /proc/mounts)"
    }
    "-f" = {
      value = "$check_mountpoints_fstab$"
      description = "Use this fstab instead (default: /etc/fstab)"
    }
    "-N" = {
      value = "$check_mountpoints_fs_field$"
      description = "FS Field number in fstab (default: 3)"
    }
    "-M" = {
      value = "$check_mountpoints_mount_field$"
      description = "Mount Field number in fstab (default: 2)"
    }
    "-O" = {
      value = "$check_mountpoints_option_field$"
      description = "Option Field number in fstab (default: 4)"
    }
    "-T" = {
      value = "$check_mountpoints_nfs_timeout$"
      description = "Responsetime at which an NFS is declared as staled (default: 3)"
    }
    "-L" = {
      set_if = "$check_mountpoints_softlinks$"
      description = "Allow softlinks to be accepted instead of mount points"
    }
    "-i" = {
      set_if = "$check_mountpoints_ignore_fstab$"
      description = "Ignore fstab. Do not fail just because mount is not in fstab. (default: unset)"
    }
    "-a" = {
      set_if = "$check_mountpoints_autoselect_mounts$"
      description = "Autoselect mounts from fstab (default: unset)"
    }
    "-A" = {
      set_if = "$check_mountpoints_fstab_autoselect$"
      description = "Autoselect from fstab. Return OK if no mounts found. (default: unset)"
    }
    "-E" = {
      value = "$check_mountpoints_exclude$"
      description = "Use with -a or -A to exclude a path from fstab. Use backslash+pipe between paths fo
    }
    "-o" = {
      set_if = "$check_mountpoints_ignore_noauto$"
      description = "When autoselecting mounts from fstab, ignore mounts having noauto flag. (default: u
    }
    "-w" = {
      set_if = "$check_mountpoints_writetest$"
      description = "Writetest. Touch file $mountpoint/.mount_test_from_$(hostname) (default: unset)"
    }
    "-e" = {
      value = "$check_mountpoints_extra$"
      description = "Extra arguments for df (default: unset)"
    }
  }
  timeout = 2m
  vars.check_mountpoints_fstab_autoselect = true
  vars.check_mountpoints_ignore_noauto = true
}
Hm, that looks good.
Since check_timeout at the service level overrides the timeout of the check command, you don’t really need to set both, I’d say.
Are you really sure that the check is still being killed after 60 seconds? Do you see that execution time in the web interface?
Otherwise I would guess that the script actually runs longer than two minutes, and since the script timeout and the Icinga (command/service) timeout are the same length, you can’t really tell which one is responsible.
I would set the script’s own timeout (-T) a bit shorter than the command timeout and see whether the output changes.
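Something along these lines might help to tell the two timeouts apart. It is only a sketch: the service name "mountpoints", the assign rule and the value 100 are my assumptions, and I am assuming that -T is given in seconds. Please adapt it to your real apply rule.

```
apply Service "mountpoints" {
  check_command = "_check_mountpoints"
  command_endpoint = host.vars.client_endpoint

  // Icinga kills the plugin after this time; this overrides the
  // timeout set in the CheckCommand object.
  check_timeout = 2m

  // Passed to the script as -T (assumed to be seconds). It is kept
  // below check_timeout so the script can report a stale mount
  // itself before Icinga kills it.
  vars.check_mountpoints_nfs_timeout = 100

  // Hypothetical rule: only hosts that define a client endpoint.
  assign where host.vars.client_endpoint
}
```

If the check is still killed after 60 seconds with this in place, the 2m value is probably not the one being used where the plugin actually runs.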
SHARED root@cop ~# icinga2 object list -n _check_mountpoints | grep timeout
* value = "$check_mountpoints_nfs_timeout$"
* timeout = 120
Could the cause of the problem be that this is a remote check, so that I must also change the timeout value in the client’s configuration, since the check is executed through the Icinga Client?
Syncing is one of the nice features of Icinga 2. Hence, I’d recommend defining your check command on the master within a global zone that is synced to every agent. Within that definition you can set the default timeout (and override it for a particular service or host if needed).
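A rough sketch of what I mean, assuming a global zone named "global-templates" (a commonly used name, but your setup may use a different one) and assuming the file locations given in the comments:

```
// zones.conf, on the master and on every agent.
// The zone name is an assumption; use whatever global zone you have.
object Zone "global-templates" {
  global = true
}

// zones.d/global-templates/commands.conf, on the master only.
// Objects in this directory are synced to all members of the global zone.
object CheckCommand "_check_mountpoints" {
  command = [ CustomPluginsDir + "/check_mountpoints.sh" ]
  // arguments omitted here; same as in the definition posted above
  timeout = 2m
}
```

After a reload on the master, running icinga2 object list -n _check_mountpoints on the agent should show whether the agent sees the same timeout value as the master.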