Icinga2 monitoring itself (icinga2 server/localhost)

Hi, I'm getting alerts in Icinga2 (icingaweb2) for itself, i.e. the server Icinga2 is running on. This server isn't set up the same way as my other servers/hosts, i.e. it's not in the host files under /etc/icinga2/conf.d.
It's complaining about the load average being too high, but I'm not concerned about the results, as they aren't high enough for me to worry about.
I believe this is being run/managed by the services.conf file, but I can't see where the “load” script/plugin is being run from, or how to change the thresholds.

apply Service "load" {
  import "generic-service"

  check_command = "load"

  /* Used by the ScheduledDowntime apply rule in downtimes.conf. */
  vars.backup_downtime = "02:00-03:00"

  assign where host.name == NodeName
}

Any ideas how to change the thresholds on this check? I'm assuming NodeName is the server itself.

thanks

You can adapt the check to your own needs:

https://icinga.com/docs/icinga2/latest/doc/10-icinga-template-library/#load
Just add the variables listed there that will solve your issue.

Cheers!

You can use

icinga2 object list -n … -t …

to find out where your objects have been defined.

Hi, thanks. I have no issues with the Nagios plugins for “hosts”, as I use, for instance:
apply Service "load_by_ssh" {
  import "generic-service"
  check_command = "by_ssh"
  vars.by_ssh_logname = "root"
  vars.by_ssh_options = "StrictHostKeyChecking=no"
  // the [ ] syntax below joins the strings
  vars.by_ssh_command = "/ns/tec/unix/u/apps/Icinga2/rhel7-plugins/check_load"
  vars.by_ssh_skip_stderr = 0
  vars.by_ssh_arguments = {
    "-w" = "48,38,33"
    "-c" = "70,60,50"
  }
  assign where host.vars.os == "Linux7"
}

But the check in services.conf (which checks the master icinga2 server) has the following

apply Service "load" {
  import "generic-service"
  check_command = "load"
  /* Used by the ScheduledDowntime apply rule in downtimes.conf. */
  vars.backup_downtime = "02:00-03:00"
  assign where host.name == NodeName
}

I don’t understand where or what to put in this services.conf script to apply thresholds to the “load” check.

vars.load_wload1 = …

For the other variables, check the ITL documentation for load.

Or even use an if-clause to distinguish the servers where you tolerate high load from the ones where you do not.

if (host.vars.highload) {
  vars.load_warn = "19,22,24"
  vars.load_crit = "24,28,30"
} else {
  vars.load_warn = "3,5,7"
  vars.load_crit = "7,9,12"
}
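
To show where such an if-clause could sit, here is a minimal sketch of a complete apply rule. It assumes a hypothetical custom variable host.vars.highload set on the hosts that may run hot, and it uses the per-interval variable names listed for load in the ITL documentation (load_wload1, load_wload5, load_wload15, load_cload1, load_cload5, load_cload15) rather than combined load_warn/load_crit strings. The threshold numbers are the ones from the snippet above, split per interval.

```
apply Service "load" {
  import "generic-service"
  check_command = "load"

  // host.vars.highload is a hypothetical custom variable, not part of the stock config
  if (host.vars.highload) {
    vars.load_wload1 = 19
    vars.load_wload5 = 22
    vars.load_wload15 = 24
    vars.load_cload1 = 24
    vars.load_cload5 = 28
    vars.load_cload15 = 30
  } else {
    vars.load_wload1 = 3
    vars.load_wload5 = 5
    vars.load_wload15 = 7
    vars.load_cload1 = 7
    vars.load_cload5 = 9
    vars.load_cload15 = 12
  }

  assign where host.name == NodeName
}
```

With the assign rule shown, only the master itself matches, so the branch only matters once the rule is widened to more hosts.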

Hi, thanks for the tips. However, I'm not talking about a regular host. I run SSH checks (all agentless) for those and apply the following via a file called ssh-checks.conf:
apply Service "load_by_ssh" {
  import "generic-service"
  check_command = "by_ssh"
  vars.by_ssh_logname = "root"
  vars.by_ssh_options = "StrictHostKeyChecking=no"
  // the [ ] syntax below joins the strings
  vars.by_ssh_command = "/ns/tec/unix/u/apps/Icinga2/rhel7-plugins/check_load"
  vars.by_ssh_skip_stderr = 0
  vars.by_ssh_arguments = {
    "-w" = "48,38,33"
    "-c" = "70,60,50"
  }
  assign where host.vars.os == "Linux7"
}

But this doesn’t apply to the icinga2 master server. This appears to be checked via the services.conf file using the following…
apply Service "load" {
  import "generic-service"
  check_command = "load"
  /* Used by the ScheduledDowntime apply rule in downtimes.conf. */
  vars.backup_downtime = "02:00-03:00"
  assign where host.name == NodeName
}

If I add the following to this section:
vars.load_warn = "19,22,24"
vars.load_crit = "24,28,30"
it makes no difference.

icingaweb reports:
WARNING - load average: 2.78, 3.15, 3.23
Servicegroups: Load Checks
Performance data:
Label   Value  Warning  Critical
load15  3.23   3.00     4.00
load1   2.78   5.00     10.00
load5   3.15   4.00

Appreciate the help…

I am also doing agentless checks by SSH, so we have similar setups.

There should be a file named hosts.conf on your master where the master host is defined.
Try adding the variables there:

vars.load_warn = "19,22,24"
vars.load_crit = "24,28,30"

and see if these appear in your host definition.
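
As a rough sketch of the host-level approach, the thresholds could go inside the existing Host object for the master. This assumes the stock definition (object Host NodeName with import "generic-host") and uses the per-interval variable names from the ITL documentation for load. It also assumes no service-level value is set for the same variables, since custom variables on the Service object are looked up before those on the Host object. The numbers are illustrative only.

```
object Host NodeName {
  import "generic-host"
  address = "127.0.0.1"

  // Illustrative thresholds; the ITL load command reads these
  // per-interval variables (assumption: not overridden on the service)
  vars.load_wload1 = 20
  vars.load_wload5 = 19
  vars.load_wload15 = 17
  vars.load_cload1 = 30
  vars.load_cload5 = 28
  vars.load_cload15 = 25
}
```

If a Host object for NodeName already exists, add only the vars lines to it rather than defining the object a second time.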

Cheers!

Cool, many thanks. I added the following to the services.conf file:
vars.load_wload1 = 20
vars.load_wload5 = 19
vars.load_wload15 = 17
vars.load_cload1 = 30
vars.load_cload5 = 28
vars.load_cload15 = 25

and removed the import "generic-service" line, so:

apply Service "load" {
  // import "generic-service"
  check_command = "load"
  vars.load_wload1 = 20
  vars.load_wload5 = 19
  vars.load_wload15 = 17
  vars.load_cload1 = 30
  vars.load_cload5 = 28
  vars.load_cload15 = 25
  /* Used by the ScheduledDowntime apply rule in downtimes.conf. */
  vars.backup_downtime = "02:00-03:00"
  assign where host.name == NodeName
}

This worked. Appreciate the help.
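
For comparison, a sketch of the same rule with the generic-service import left in place. This assumes generic-service is the stock template from conf.d/templates.conf, which, as far as I know, only sets max_check_attempts, check_interval and retry_interval and does not touch the load thresholds. If your template was customized, check it with icinga2 object list before relying on this.

```
apply Service "load" {
  // Assumption: stock template, only check attempts and intervals
  import "generic-service"
  check_command = "load"

  // Per-interval thresholds read by the ITL load command
  vars.load_wload1 = 20
  vars.load_wload5 = 19
  vars.load_wload15 = 17
  vars.load_cload1 = 30
  vars.load_cload5 = 28
  vars.load_cload15 = 25

  /* Used by the ScheduledDowntime apply rule in downtimes.conf. */
  vars.backup_downtime = "02:00-03:00"
  assign where host.name == NodeName
}
```

Keeping the import means the check keeps the same intervals and retry behaviour as the other services that use the template.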

Then mark the thread as solved, so people will know.

Cheers!

PS: It would be better if you selected the correct answer as the solution. That will look much nicer.