Uptime check: Terminated by signal 9

Hello,

The NCPA uptime check fails with the following message in Icinga 2 version 2.13.2-1:

<Timeout exceeded.><Terminated by signal 9 (Killed).>

Icinga2.log

[2022-04-04 14:05:17 +0200] warning/Process: Terminating process 8187 ('sudo' '/usr/lib64/nagios/plugins/check_ncpa.py' '-H' 'hub.local' '-M' 'system/uptime' '-T' '120' '-c' '36000:' '-t' 'api_key' '-w' '36000:') after timeout of 60 seconds
[2022-04-04 14:05:17 +0200] warning/Process: Terminating process 8199 ('sudo' '/usr/lib64/nagios/plugins/check_ncpa.py' '-H' '$host.vars.iloAddress$' '-M' 'system/uptime' '-T' '120' '-c' '36000:' '-t' 'api_key' '-w' '36000:') after timeout of 60 seconds
[2022-04-04 14:05:17 +0200] warning/Process: Terminating process 8202 ('sudo' '/usr/lib64/nagios/plugins/check_ncpa.py' '-H' '10.116.248.15' '-M' 'system/uptime' '-T' '120' '-c' '36000:' '-t' 'api_key' '-w' '36000:') after timeout of 60 seconds
[2022-04-04 14:05:17 +0200] warning/Process: Terminating process 8207 ('sudo' '/usr/lib64/nagios/plugins/check_ncpa.py' '-H' 'otn-cloud-nw08.local' '-M' 'system/uptime' '-T' '120' '-c' '36000:' '-t' 'api_key' '-w' '36000:') after timeout of 60 seconds
[2022-04-04 14:05:17 +0200] warning/Process: Terminating process 8214 ('sudo' '/usr/lib64/nagios/plugins/check_ncpa.py' '-H' '$host.vars.iloAddress$' '-M' 'system/uptime' '-T' '120' '-c' '36000:' '-t' 'api_key' '-w' '36000:') after timeout of 60 seconds
[2022-04-04 14:05:17 +0200] warning/Process: Terminating process 8211 ('sudo' '/usr/lib64/nagios/plugins/check_ncpa.py' '-H' '10.116.219.75' '-M' 'system/uptime' '-T' '120' '-c' '36000:' '-t' 'api_key' '-w' '36000:') after timeout of 60 seconds
[2022-04-04 14:05:17 +0200] warning/Process: Killing process group 7948 ('sudo' '/usr/lib64/nagios/plugins/check_ncpa.py' '-H' '$host.vars.iloAddress$' '-M' 'system/uptime' '-T' '120' '-c' '36000:' '-t' 'api_key' '-w' '36000:') after timeout of 66 seconds
[2022-04-04 14:05:17 +0200] warning/Process: PID 7948 was terminated by signal 9 (Killed)

check command:

object CheckCommand "check_ncpa_uptime" {
  import "plugin-check-command"

  command = [ "sudo", PluginDir + "/check_ncpa.py" ]
  arguments += {
    "-H" = "$address$",
    "-t" = api_token,
    "-t" = "api_key",
    "-M" = "$remote_ncpa_metric$",
    "-w" = "36000:",
    "-c" = "36000:",
    "-T" = "120"
  }
}

service check:

apply Service "ncpa_uptime" {
  import "generic-service"
  check_command = "check_ncpa_uptime"
  vars.remote_ncpa_metric = "system/uptime"
}

All the other NCPA checks are successful.

I have read that sometimes a lack of RAM might cause this, but in my case only ~20% of RAM is in use.
Can anyone help? :slight_smile:

Hello lobr,
If you run the check_ncpa.py command manually from the command line, how long does it take to get a response back? The command may take longer than the default timeout (60 seconds) to complete. That’s why you are receiving timeout errors. You can add a longer command timeout to your check command if needed.

object CheckCommand "check_ncpa_uptime" {
  import "plugin-check-command"

  command = [ "sudo", PluginDir + "/check_ncpa.py" ]
  arguments += {
    "-H" = "$address$",
    ...
  }
  timeout = 5m
}
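
For context on the "Terminated by signal 9" message: the log lines in the first post show Icinga terminating the plugin process after the check timeout, which is a different situation from the kernel killing a process for lack of RAM. Below is a minimal Python sketch of that pattern. It is not Icinga's actual implementation; the helper name `run_with_timeout` and the sleeping child process are assumptions made purely for illustration.

```python
import signal
import subprocess
import sys


def run_with_timeout(argv, timeout_s):
    """Run argv and send SIGKILL if it runs longer than timeout_s seconds.

    Returns the child's return code. On POSIX a negative value -N means
    the child was terminated by signal N.
    """
    proc = subprocess.Popen(argv)
    try:
        proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX, i.e. signal 9
        proc.wait()  # reap the child so returncode is set
    return proc.returncode


# Stand-in for a slow plugin: a child that sleeps far longer than the limit.
slow_child = [sys.executable, "-c", "import time; time.sleep(30)"]
rc = run_with_timeout(slow_child, 0.5)
print("killed by SIGKILL:", rc == -signal.SIGKILL)
```

The point of the sketch is that a signal 9 reported right after a "timeout of 60 seconds" line says the supervisor gave up waiting, not that memory ran out. Assuming `-T` is the plugin's own timeout, a value of 120 is also longer than the 60 seconds Icinga waits in that log, so Icinga would stop the process before the plugin could report its own timeout.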

Regards 
Alex

Hi @aclark6996
Even with that configuration it does not work…
debug.log

[2022-04-06 13:52:30 +0200] debug/InfluxdbWriter: Checkable 'otn-ac-monp-db03' adds to metric list:'hostalive,hostname=otn-ac-monp-db03,metric=rta unit="seconds",value=0.001865 1649245950'.
[2022-04-06 13:52:30 +0200] debug/InfluxdbWriter: Checkable 'otn-ac-monp-db03' adds to metric list:'hostalive,hostname=otn-ac-monp-db03,metric=pl unit="percent",value=0 1649245950'.

icingaweb2:
[screenshot]

Sometimes it turns into a WARNING instead of UNKNOWN, but then it switches back again…
[screenshot]

But directly on the console (master or satellite, doesn’t matter), the check works!
master:

[root@otn-ac-monp-ma01 icingaweb2]# /usr/lib64/nagios/plugins/check_ncpa.py -H 10.116.220.209 -t w2vf2QOLsap9VyD2 -M 'system/uptime' --timeout=5
OK: Uptime was 2 days 1 hour 59 minutes 52 seconds | 'uptime'=179992.00s;;;

Satellite:

[root@otn-ac-monp-sa01 ~]# /usr/lib64/nagios/plugins/check_ncpa.py -H 10.116.220.209 -t w2vf2QOLsap9VyD2 -M 'system/uptime' --timeout=5
OK: Uptime was 2 days 2 hours 53 seconds | 'uptime'=180053.00s;;;

Might this be an icingaweb2 display issue?

You ran the plugin manually as user root. Icinga runs plugins as user nagios or icinga (depending on your distribution). In your case sudo is involved as well, as the log output shows. In that setup the plugin evidently does not have sufficient permissions, so it is killed once the timeout is reached.

To test plugins as that user, you could run, for example:

sudo -u nagios /usr/lib64/nagios/plugins/check_ncpa.py …
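
If you want to repeat that test for several hosts, a small wrapper can help. The following is only a sketch under stated assumptions: the helper names `build_test_command` and `run_check` are made up for this example, `HOST` and `TOKEN` are placeholders, and the inner `sudo -n` is there to mirror the `sudo` in the CheckCommand while making sudo fail with an error instead of prompting for a password.

```python
import subprocess

# Plugin path as used in this thread.
PLUGIN = "/usr/lib64/nagios/plugins/check_ncpa.py"


def build_test_command(service_user, host, token, metric="system/uptime"):
    """Build the argv that runs the plugin as service_user through sudo."""
    return [
        "sudo", "-u", service_user,  # switch to the Icinga service user
        "sudo", "-n",                # inner sudo as in the CheckCommand; -n never prompts
        PLUGIN, "-H", host, "-t", token, "-M", metric,
    ]


def run_check(argv, timeout_s=60):
    """Run argv and return (returncode, output); returncode is None on timeout."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None, "timed out"
    return done.returncode, done.stdout + done.stderr


# HOST and TOKEN are placeholders; "nagios" may be "icinga" on your distribution.
cmd = build_test_command("nagios", "HOST", "TOKEN")
print(" ".join(cmd))
```

Calling `run_check(cmd)` as root on the master or satellite then shows whether the plugin answers, fails on a sudo permission error, or hits the time limit when it is started the way Icinga starts it.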