Immediately reschedule all services after a host recovers?

Is it possible to immediately reschedule a check of all services on a host after it recovers (i.e. moves from DOWN to UP)? We have a number of services that only run once a day under normal circumstances, but if a host has rebooted we’d like to know immediately if any of those checks have changed.

An example would be a network switch software version check - we don’t need to check this every minute, but if a switch has been rebooted onto a new software version then we’d want to know immediately after it comes back online.

Hi,

You could use an event handler script for this, which detects the state change from NOT-OK to OK and then calls the REST API to force a re-check. That script needs an EventCommand object, which is then assigned to the host or service via the event_command attribute.
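A rough sketch of the REST call such a script could make, assuming the API listens on the default port 5665 and an ApiUser with permission for the reschedule-check action exists. The host name, credentials and CA file path are placeholders, not values from this thread:

```python
import base64
import json
import ssl
import urllib.request

# Assumed endpoint for illustration; adjust host and port to your setup.
API_URL = "https://localhost:5665/v1/actions/reschedule-check"


def build_reschedule_request(host_name, user, password, api_url=API_URL):
    """Build a POST request that forces a re-check of all services on one host."""
    payload = {
        "type": "Service",
        # filter_vars keeps the host name out of the filter expression itself.
        "filter": "host.name==filter_host",
        "filter_vars": {"filter_host": host_name},
        "force": True,
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode(),
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def send(request, ca_file):
    """Send the request, verifying the API certificate against ca_file."""
    context = ssl.create_default_context(cafile=ca_file)
    with urllib.request.urlopen(request, context=context) as response:
        return json.loads(response.read().decode())
```

The event handler would call build_reschedule_request() with the recovered host's name and pass the result to send(); the JSON response lists one result per rescheduled service.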

Cheers,
Michael

Thanks Michael - that’s the conclusion I’d come to as well, only the documentation on EventCommands is a bit sparse. It’s not clear how I’d limit the script to only take that action on host recovery - from the docs:

" Unlike notifications, event commands for hosts/services are called on every check execution if one of these conditions matches:

  • The host/service is in a soft state
  • The host/service state changes into a hard state
  • The host/service state recovers from a soft or hard state to OK/Up"

So that seems to suggest that it would be triggered if any of those conditions matches, whereas we’re only interested in the last one?

Hi,

Event commands are fired in all of the described situations; the script itself has to decide in which of them it actually does something. They follow the same principle as in the old Icinga 1/Nagios world … which isn’t that beautiful to handle, but no-one ever came up with a better design and implementation for this feature.

Similar to CheckCommand and NotificationCommand objects, EventCommand objects also accept arguments, which are then evaluated inside the script. That lets you pass typical runtime macros such as last_hard_state and build your own state machine inside the event handler.
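A minimal sketch of such a state machine, assuming the EventCommand is attached to the host and passes the host name, the current state and the previous state as arguments. The option names are made up for this example, and the values are assumed to come from runtime macros such as host.name, host.state and host.last_state:

```python
import argparse


def is_host_recovery(state, last_state):
    """Return True only when the host is UP now and was not UP before."""
    return state == "UP" and last_state != "UP"


def parse_args(argv=None):
    # Hypothetical option names; they must match the EventCommand arguments.
    parser = argparse.ArgumentParser(description="Host recovery event handler (sketch)")
    parser.add_argument("--host", required=True)
    parser.add_argument("--state", required=True)
    parser.add_argument("--last-state", required=True)
    return parser.parse_args(argv)


def run(argv=None):
    """Return True if the recovery branch was taken, False otherwise."""
    args = parse_args(argv)
    if not is_host_recovery(args.state, args.last_state):
        # Soft or hard problem state: the handler was called, but there is nothing to do.
        return False
    # This is where the forced re-check of the host's services would be triggered.
    print(f"{args.host} recovered, re-checking its services")
    return True
```

Called as `run(["--host", "switch01", "--state", "UP", "--last-state", "DOWN"])` it takes the recovery branch; for every other combination it returns without doing anything. The same pattern extends to further macros, for example the state type or last_hard_state, if the handler should be stricter about what counts as a recovery.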

I don’t have good examples at hand, as I have rarely used event handlers in 10 years :wink:

Cheers,
Michael