Automatic downtime when host is down?

We monitor all our machines with Icinga2, configured through Puppet. Not all machines are switched on or connected to a network all the time: laptops and developer machines get shut down or carried around. Is there a way to automatically acknowledge the problem or set a flexible downtime for those hosts, so that they are checked while they are reachable, but neither checked nor reported in the frontend or through notifications while they are unreachable?

Hi,

which event can be used to detect that the host is going down? For example, you could hook into the shutdown routine and schedule a downtime via the REST API from a script that is deployed and managed by Puppet.
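
To make the shutdown-hook idea more concrete, here is a minimal sketch of a script that schedules a fixed downtime through the `schedule-downtime` API action. The endpoint, the credentials, the author name `shutdown-hook`, and the function names are assumptions for illustration, and it presumes the machine can still reach the API while it shuts down.

```shell
#!/bin/sh
# Hypothetical shutdown hook: schedule a downtime for a host via the Icinga 2 REST API.
# Endpoint and credentials are assumptions; adjust them to your environment.
API_URL="${ICINGA_API_URL:-https://localhost:5665}"
API_USER="${ICINGA_API_USER:-root}"
API_PASS="${ICINGA_API_PASS:-icinga}"

# Print the JSON body for /v1/actions/schedule-downtime.
# $1 = host name, $2 = start time (epoch seconds), $3 = length in seconds
# Assumes the host name contains no characters that need JSON escaping.
downtime_payload() {
  end=$(( $2 + $3 ))
  printf '{"type":"Host","filter":"host.name==\\"%s\\"","author":"shutdown-hook","comment":"Host shut down","start_time":%s,"end_time":%s,"fixed":true}' "$1" "$2" "$end"
}

# Send the request. -k skips certificate verification and is only
# acceptable for testing; use --cacert with your CA in production.
schedule_downtime() {
  curl -k -s -S -u "$API_USER:$API_PASS" \
    -H 'Accept: application/json' \
    -X POST "$API_URL/v1/actions/schedule-downtime" \
    -d "$(downtime_payload "$1" "$(date +%s)" "$2")"
}

# Only act when a host name is given; the length defaults to 8 hours.
if [ "$#" -ge 1 ]; then
  schedule_downtime "$1" "${2:-28800}"
fi
```

It could be called from the shutdown routine as, for example, `schedule_downtime.sh "$(hostname -f)" 28800`, where the script name is again only a placeholder.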

Cheers,
Michael

Well, when they’re down then they don’t respond to pings anymore. Basically, I want the DOWN state that is already detected by Icinga (naturally, through the hostalive check I believe) to lead to automatic “acknowledged” or “downtime” state.

Hi,

you can use event commands/handlers for this. On each state change, a script is run. Depending on the state type and attempt count, you can e.g. fire a curl request against the REST API and acknowledge a problem or schedule a downtime. This needs a little scripting on your side though, with the example API clients from the docs chapter in mind.

https://icinga.com/docs/icinga2/latest/doc/03-monitoring-basics/#event-commands
https://icinga.com/docs/icinga2/latest/doc/12-icinga2-api/#actions
https://icinga.com/docs/icinga2/latest/doc/12-icinga2-api/#api-clients
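
As a rough illustration of such an event handler, here is a minimal sketch that acknowledges a host problem once it reaches the HARD DOWN state. It assumes the EventCommand passes the host name, state type, and state as three arguments; the endpoint, credentials, author name, and function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical event handler: acknowledge a host problem via the Icinga 2 REST API.
# Expected arguments: <host name> <state type> <state>
# Endpoint and credentials are assumptions; adjust them to your environment.
API_URL="${ICINGA_API_URL:-https://localhost:5665}"
API_USER="${ICINGA_API_USER:-root}"
API_PASS="${ICINGA_API_PASS:-icinga}"

# Succeed only for a HARD DOWN state. $1 = state type, $2 = state
should_acknowledge() {
  [ "$1" = "HARD" ] && [ "$2" = "DOWN" ]
}

# Print the JSON body for /v1/actions/acknowledge-problem. $1 = host name
# Assumes the host name contains no characters that need JSON escaping.
ack_payload() {
  printf '{"type":"Host","filter":"host.name==\\"%s\\"","author":"event-handler","comment":"Automatically acknowledged: host is unreachable"}' "$1"
}

handle_event() {
  # Nothing to do for SOFT states or recoveries: return 0 so that
  # Icinga does not log the run as a failed event command.
  should_acknowledge "$2" "$3" || return 0
  # -k skips certificate verification and is only acceptable for testing.
  curl -k -s -S -u "$API_USER:$API_PASS" \
    -H 'Accept: application/json' \
    -X POST "$API_URL/v1/actions/acknowledge-problem" \
    -d "$(ack_payload "$1")"
}

# Only act when called with exactly three arguments.
if [ "$#" -eq 3 ]; then
  handle_event "$@"
fi
```

Returning 0 for the states the handler ignores matters because Icinga reports any non-zero exit code of an event command as a warning.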

Cheers,
Michael

Thanks Michael for the tip!

We also have a use case for an automatic acknowledgement when a host reaches the HARD state.
I implemented a script that sends a curl request to the master. It works, but I am wondering about the entries in icinga2.log.
Do you have any idea why there are warnings in the log?

warning/PluginEventTask: Event command for object ‘Hostname’ (PID: 8626, arguments: ‘/etc/…/send_acknowledgement.sh’ ‘Hostname’ ‘HARD’ ‘DOWN’) terminated with exit code 128, output: execvpe(/etc/…/send_acknowledgement.sh) failed: No such file or directory

Can you share the exact EventCommand definition?

Of course:

object EventCommand "send_acknowledgement" {
  import "plugin-event-command"
  command = [
    "/etc/…/send_acknowledgement.sh",
    "$host.name$",
    "$host.state_type$",
    "$host.state$"
  ]
}

Is that the real path or just censored? What happens if you run ls -la on that path?
Likewise, the service object where this event_command is configured, does this involve command_endpoint?

Cheers,
Michael

That’s a censored path, but I have solved the problem now!

The EventCommand runs on the satellite server, so at first I deployed the script only on the satellite. I have now put the script on the master server under the same path as well, and the log looks better.

[2019-08-29 13:40:07 +0200] warning/PluginEventTask: Event command for object ‘Router’ (PID: 21087, arguments: ‘/etc/…/send_acknowledgement.sh’ ‘Router’ ‘SOFT’ ‘DOWN’) terminated with exit code 1, output:
[2019-08-29 13:40:21 +0200] warning/PluginEventTask: Event command for object ‘Router’ (PID: 21118, arguments: ‘/etc/…/send_acknowledgement.sh’ ‘Router’ ‘SOFT’ ‘DOWN’) terminated with exit code 1, output:
[2019-08-29 13:40:26 +0200] warning/PluginEventTask: Event command for object ‘Router’ (PID: 21143, arguments: ‘/etc/…/send_acknowledgement.sh’ ‘Router’ ‘SOFT’ ‘DOWN’) terminated with exit code 1, output:
[2019-08-29 13:40:29 +0200] warning/PluginEventTask: Event command for object ‘Router’ (PID: 21151, arguments: ‘/etc/…/send_acknowledgement.sh’ ‘Router’ ‘SOFT’ ‘DOWN’) terminated with exit code 1, output:
[2019-08-29 13:40:30 +0200] information/ConfigObjectUtility: Created and activated object ‘Router!fbce912c-14de-4d96-8d39-47094345f2b5’ of type ‘Comment’.

The script only sends an acknowledgement after the host reaches the HARD DOWN state, which in our case is usually after 5 check attempts. The icinga2.log only shows event command entries that terminated with a non-zero exit code. I think that’s normal?

Can you share the script? An exit code of 1 is treated as an error; an exit code of 0 means everything is OK.

Ahh, now I see that the event command messages are “warnings”, which is why they show up in icinga2.log. This means that a successfully executed event command is not displayed in icinga2.log. Now I understand it.

I’ll change my script to exit with 0 when the state is SOFT and DOWN, so that the messages no longer appear in icinga2.log.

Warnings typically need attention by the admin.

If the command executed OK, this is logged with notice/debug severity, which is not written to the main application log, icinga2.log. Instead, you get these messages with the debug log feature, for example. You can also modify the feature configuration and add an additional notice logger, if you prefer that.

vim /etc/icinga2/features-enabled/debuglog.conf

object FileLogger "notice-file" {
  severity = "notice"
  path = LogDir + "/notice.log"
}

Cheers,
Michael

Thanks for the hint!