Active host reboot check

ha3mak · April 4, 2021, 6:49pm

Hi!

I’d like to do an active “Host rebooted?” check. Currently I have a passive check defined in Icinga2, and a script runs on every node every minute to check whether the host has rebooted. When a reboot is detected, the script calls the Icinga2 API and sends a passive check result that sets the state of the check to “CRITICAL”. With this method I can easily detect an unexpected reboot (e.g. cluster fencing). If I did this with a plain active check, the check would be “CRITICAL” once and go back to “OK” on the next run.
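For readers who have not used the passive approach described above, here is a minimal sketch of how such a script might build the API call. It assumes the standard `process-check-result` action of the Icinga 2 REST API; the URL, credentials, host name and service name are placeholders, not values from this thread. The request is only constructed here, not sent.

```python
import base64
import json
import urllib.request


def build_passive_result_request(host, service, exit_status, output,
                                 api_url="https://localhost:5665",
                                 user="apiuser", password="secret"):
    """Build (but do not send) a process-check-result request.

    api_url, user and password are placeholder values for illustration.
    exit_status follows the plugin convention: 0 = OK, 2 = CRITICAL.
    """
    payload = {
        "type": "Service",
        "filter": f'host.name=="{host}" && service.name=="{service}"',
        "exit_status": exit_status,
        "plugin_output": output,
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{api_url}/v1/actions/process-check-result",
        data=json.dumps(payload).encode(),
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Example: the request a node might send after detecting a reboot.
request = build_passive_result_request(
    "node1.example.com", "Uptime", 2, "CRITICAL - host rebooted")
# Sending it would be urllib.request.urlopen(request, context=...),
# with an SSL context that trusts the Icinga 2 CA.
```

The same function with `exit_status` set to 0 would submit an “OK” result.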

I’d like to replace this passive method with an active check, and I have an idea: I write a plugin that my active check calls every minute. The plugin returns “OK” normally and “CRITICAL” when a reboot is detected. In my apply Service definition I would add a condition: if the state is not “OK”, set enable_active_checks=false, otherwise true, so the check stays “CRITICAL” until I change it back to “OK” manually.

I think something like this:

apply Service "Uptime" {
	import "generic-service"
	command_endpoint = host.vars.client_endpoint

	check_command = "custom_uptime"

	if ( state != "OK" ) {
		enable_active_checks = false
	} else {
		enable_active_checks = true
	}

	assign where host.zone == ZoneName && host.vars.os.type == "linux"
}

I’m afraid what I want is not possible, because the config is evaluated only on start/reload. Is it possible to tell Icinga2 to stop checking when the check state is not OK, and to resume periodic checking once I manually send an “OK” passive check result?

Thanks!

theFeu · April 6, 2021, 7:55am

Hello there and welcome back!
For code and configuration, please use markdown formatting for better readability!
Good luck and have a nice day

jbrost · April 9, 2021, 4:07pm

What you’re trying to do is not really what Icinga is designed for. However, you can get close to it by creative use of some features. With a small trick in the config language, you can access the result of the last check, allowing you to use it as an input for the following check. So the following is an example of a service that just keeps its state:

apply Service "keep-state" {
	import "generic-service"
	check_command = "dummy"
	var that = this
	vars.dummy_state = function() use(that) {
		return if (that.last_check_result) { that.last_check_result.state } else { 0 }
	}
	assign where ...
}

You could use this to pass the last state as an additional parameter to your check command and if it wasn’t OK, just return the same state again. To reset it, you’d use the process check result function in Web 2 to submit an OK result.

ha3mak · April 10, 2021, 6:55pm

Thanks for your reply, Julian. It’s exactly what I was trying to achieve. I modified my plugin to handle an extra “-c” argument; when it is present, the plugin exits with CRITICAL. With this plugin and the service definition you suggested, it works great!
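The poster's actual plugin is not shown in the thread. As an illustration only, a plugin along these lines might look like the sketch below. The reboot detection (comparing the `btime` value from `/proc/stat` with a value remembered in a state file) and the state file path are assumptions. Only the `-c` flag and its meaning come from the post above.

```python
import argparse
import sys
from pathlib import Path

OK, CRITICAL = 0, 2  # standard monitoring plugin exit codes

# Hypothetical location for remembering the last seen boot time.
STATE_FILE = Path("/var/tmp/check_reboot.btime")


def read_boot_time(stat_path="/proc/stat"):
    """Return the boot time (seconds since the epoch) from the btime line."""
    for line in Path(stat_path).read_text().splitlines():
        if line.startswith("btime "):
            return int(line.split()[1])
    raise RuntimeError("no btime line in " + stat_path)


def evaluate(force_critical, current_boot, stored_boot):
    """Decide the exit code and the plugin output line.

    force_critical: True when -c was passed because the last state was not OK.
    stored_boot: boot time remembered from the previous run, None on first run.
    """
    if force_critical:
        return CRITICAL, "CRITICAL - previous state was not OK, waiting for manual reset"
    if stored_boot is not None and stored_boot != current_boot:
        return CRITICAL, "CRITICAL - host rebooted (boot time changed)"
    return OK, "OK - no reboot detected"


def main(argv=None):
    parser = argparse.ArgumentParser(description="Sticky reboot check (sketch)")
    parser.add_argument("-c", action="store_true", dest="force_critical",
                        help="exit CRITICAL regardless of the current boot time")
    args = parser.parse_args(argv)

    current = read_boot_time()
    stored = int(STATE_FILE.read_text()) if STATE_FILE.exists() else None
    code, message = evaluate(args.force_critical, current, stored)
    # Always remember the current boot time, so that an OK result submitted
    # by hand is not immediately overridden on the next run.
    STATE_FILE.write_text(str(current))
    print(message)
    return code


# When installed as a plugin, end the file with: sys.exit(main())
```

The decision logic lives in `evaluate` so it can be exercised without touching `/proc` or the state file. How `-c` gets passed depends on the CheckCommand definition, which is not shown in the thread.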