Timeout Exceeded for Bash Script

I have implemented a new, relatively simple bash script to check fan speed on Cisco ASR9Ks. A run averages ~7 seconds across all agents (each agent checks different devices depending on the zone). Some runs take as long as 15 seconds, but none exceed that.

In the check command properties, I’ve set the timeout to various values, but some services always flap between “normal” output and one of the following two messages:

# this
<Timeout exceeded.>awk: (FILENAME=- FNR=1) warning: error writing standard output (Broken pipe)

# or this
<Terminated by signal 15 (Terminated).>

I’ve played around with different timeout values (again for the check command definition), and it seems like if I increase the timeout value, more of the services flap. If I decrease the timeout value, it seems like more services stabilize. Values tested for timeout were 10, 20, 30, and 60 seconds.

For now I have removed the timeout value, which then seems to apply a default:
check_timeout 20 (pulled from the inspect feature of the Director module)

I have confirmed that increasing the timeout on the check command has no effect; the check still falls back to the 20 seconds shown above.

When running the command from the terminal for testing, I’m never able to reproduce this.
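One way to make the terminal test more comparable is to run the plugin many times and record the slowest run, rather than judging from a few manual executions. The following is only a minimal sketch: the function name `max_runtime` and the plugin path in the usage comment are hypothetical, and it assumes `date +%s` is available, so the result is in whole seconds.

```shell
#!/usr/bin/env bash
# Run a command several times and print the slowest wall-clock runtime
# in whole seconds. Output of the command itself is discarded.
# Usage: max_runtime <runs> <command> [args...]
max_runtime() {
    local runs=$1; shift
    local max=0 start elapsed i
    for ((i = 0; i < runs; i++)); do
        start=$(date +%s)
        "$@" >/dev/null 2>&1
        elapsed=$(( $(date +%s) - start ))
        if (( elapsed > max )); then
            max=$elapsed
        fi
    done
    echo "$max"
}

# Hypothetical usage (path and arguments are placeholders):
#   max_runtime 20 /path/to/check_fan.sh -H some-router
```

Running this on each agent, ideally as the same user the Icinga 2 daemon runs as and against every device in that zone, would show whether the worst case really stays below the configured timeout.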

I am able to find similar issues in community posts, but they’re always focused on increasing the timeout value as opposed to finding out why it’s timing out.

Through some basic troubleshooting, I found that the service apply rule imports a template that forces the 20 second check timeout – I’ve created 2 other templates that utilize a 30 second timeout and a 60 second timeout respectively, but I achieve the same results.

I guess the “question” boils down to: what is causing all of the latency in the check command being issued from Icinga vs the lack of latency when executed manually?

  • Version used (icinga2 --version): 2.13.0-1
  • Operating System and version: CentOS 7, kernel 3.10.0-1127.8.2.el7.x86_64
  • Enabled features (icinga2 feature list): api checker command ido-mysql influxdb mainlog notification statusdata
  • Icinga Web 2 version and modules (System - About): 2.7.3
  • Config validation (icinga2 daemon -C): REDACTED, but config loads fine.
  • If you run multiple Icinga 2 instances, the zones.conf file (or icinga2 object list --type Endpoint and icinga2 object list --type Zone) from all affected nodes (some nodes may appear to be in 2 zones, or otherwise be listed twice, but they have different domain names that I have redacted):
Object 'icinga02' of type 'Endpoint':
  % declared in '/etc/icinga2/zones.conf', lines 6:1-6:41
  * __name = "icinga02"
  * host = ""
  * log_duration = 86400
  * name = "icinga02"
  * package = "_etc"
  * port = "5665"
  * source_location
    * first_column = 1
    * first_line = 6
    * last_column = 41
    * last_line = 6
    * path = "/etc/icinga2/zones.conf"
  * templates = [ "icinga02" ]
    % = modified in '/etc/icinga2/zones.conf', lines 6:1-6:41
  * type = "Endpoint"
  * zone = ""

Object 'ica02m02n' of type 'Endpoint':
  % declared in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 7:1-7:42
  * __name = "ica02m02n"
  * host = "ica02m02n"
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 8:5-8:37
  * log_duration = 86400
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 10:5-10:21
  * name = "ica02m02n"
  * package = "director"
  * port = "5665"
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 9:5-9:17
  * source_location
    * first_column = 1
    * first_line = 7
    * last_column = 42
    * last_line = 7
    * path = "/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf"
  * templates = [ "ica02m02n" ]
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 7:1-7:42
  * type = "Endpoint"
  * zone = "master"

Object 'ica01m02n' of type 'Endpoint':
  % declared in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 1:0-1:41
  * __name = "ica01m02n"
  * host = "ica01m02n"
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 2:5-2:37
  * log_duration = 86400
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 4:5-4:21
  * name = "ica01m02n"
  * package = "director"
  * port = "5665"
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 3:5-3:17
  * source_location
    * first_column = 0
    * first_line = 1
    * last_column = 41
    * last_line = 1
    * path = "/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf"
  * templates = [ "ica01m02n" ]
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 1:0-1:41
  * type = "Endpoint"
  * zone = "master"

Object 'ica03m02n' of type 'Endpoint':
  % declared in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 25:1-25:42
  * __name = "ica03m02n"
  * host = "ica03m02n"
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 26:5-26:37
  * log_duration = 86400
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 28:5-28:21
  * name = "ica03m02n.nsvltn"
  * package = "director"
  * port = "5665"
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 27:5-27:17
  * source_location
    * first_column = 1
    * first_line = 25
    * last_column = 42
    * last_line = 25
    * path = "/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf"
  * templates = [ "ica03m02n" ]
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 25:1-25:42
  * type = "Endpoint"
  * zone = "master"

Object 'ica04m02n' of type 'Endpoint':
  % declared in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 31:1-31:42
  * __name = "ica04m02n"
  * host = "ica04m02n"
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 32:5-32:37
  * log_duration = 86400
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 34:5-34:21
  * name = "ica04m02n"
  * package = "director"
  * port = "5665"
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 33:5-33:17
  * source_location
    * first_column = 1
    * first_line = 31
    * last_column = 42
    * last_line = 31
    * path = "/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf"
  * templates = [ "ica04m02n" ]
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 31:1-31:42
  * type = "Endpoint"
  * zone = "master"

Object 'ica01m02n' of type 'Endpoint':
  % declared in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 13:1-13:42
  * __name = "ica01m02n"
  * host = "ica01m02n"
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 14:5-14:37
  * log_duration = 86400
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 16:5-16:21
  * name = "ica01m02n"
  * package = "director"
  * port = "5665"
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 15:5-15:17
  * source_location
    * first_column = 1
    * first_line = 13
    * last_column = 42
    * last_line = 13
    * path = "/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf"
  * templates = [ "ica01m02n" ]
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 13:1-13:42
  * type = "Endpoint"
  * zone = "master"

Object 'ica02m02n' of type 'Endpoint':
  % declared in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 19:1-19:42
  * __name = "ica02m02n"
  * host = "ica02m02n"
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 20:5-20:37
  * log_duration = 86400
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 22:5-22:21
  * name = "ica02m02n"
  * package = "director"
  * port = "5665"
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 21:5-21:17
  * source_location
    * first_column = 1
    * first_line = 19
    * last_column = 42
    * last_line = 19
    * path = "/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf"
  * templates = [ "ica02m02n" ]
    % = modified in '/var/lib/icinga2/api/packages/director/46b75cce-5f66-49dc-9ea0-3f5dfd5daaa7/zones.d/master/endpoints.conf', lines 19:1-19:42
  * type = "Endpoint"
  * zone = "master"

Edit to add:
Device-level troubleshooting shows no packet loss (which is common with the SNMP errors we see), and all other SNMP checks perform fine. We walk a pretty large table, which is why check execution takes 7-15 seconds when run manually.

Hello @steaksauce!

It seems you’ve hit this one:

https://github.com/Icinga/icinga2/issues/8703

Any way to (partially) parallelize the script’s work?

Best,
AK

Oof, I forgot to update this one –

this was actually caused by, you guessed it, the device timing out.

I found that the devices I was testing against were fine, and that I had made some assumptions about the run time across all devices. In addition to this, there was some packet loss introduced on the network.

TL;DR: I increased the timeout to 90 seconds, and increased check intervals to 30 minutes (not like fans fail all day).
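For anyone hitting the same symptom: a slow device is easier to spot if the script bounds its own slow step and reports that clearly, instead of being killed from outside with "Terminated by signal 15". Below is a minimal sketch, not the script from this thread. The function name `run_bounded` is hypothetical, and it assumes the GNU coreutils `timeout` command, which exits with status 124 when the limit is reached.

```shell
#!/usr/bin/env bash
# Run a command with a hard time limit. If the limit is hit, print a
# plugin-style UNKNOWN message and return 3; otherwise pass through the
# command's output and exit status.
# Usage: run_bounded <seconds> <command> [args...]
run_bounded() {
    local limit=$1; shift
    local output rc
    output=$(timeout "$limit" "$@" 2>&1)
    rc=$?
    if (( rc == 124 )); then
        # 124 is the status GNU timeout returns when it had to stop the command
        echo "UNKNOWN - command did not finish within ${limit}s"
        return 3
    fi
    printf '%s\n' "$output"
    return "$rc"
}
```

The limit inside the script would need to be shorter than the check timeout configured in Icinga, so that the script gets to print its own message first.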