Notifications Gone Wild

I am trying to set up limits on the notifications that I get when a host is offline.
I have defined the following Check execution parameters in the Host Template.

When the host goes offline, Icinga seems to crank out an unlimited number of notifications, one after the other, every 15 seconds.

This is the host history when I removed the acknowledgement so Notifications started going out.

[image: host history]

What am I missing to put on the brakes?

My overall desire for this would be as follows:
1. Check the host every 2 minutes.
2. If the host goes offline, send a notification.
3. Check again in 5 minutes; if it is still offline, send another notification (do this 2 more times).
4. If the host recovers, send a notification.

Any help or guidance is appreciated.

What are the settings for the notification apply rule and the notification template?

I think your question has put me in the right direction.

This is the notification template.

This is the notification apply rule.

I adjusted the values in the apply rule, and the notifications have calmed down.

In terms of precedence, which settings take priority?

Regarding precedence, here is an example of what I mean.

I want the server Transcoder 1 to be ignored by my general hosts rule and only send notifications based on the second rule.

I’ve applied this but the first rule still seems to be in effect.

Hi :slight_smile:

What you currently have set up is:

  1. You check at an interval of 2 min.
  2. Your host goes offline.
  3. It retries at an interval of 5 min for 3 attempts:
    Up → Down (soft) [1st attempt]
    Down (soft) → Down (soft) [2nd attempt]
    Down (soft) → Down (hard) [3rd attempt]
  4. With the change to the hard state, notifications get triggered.
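
For reference, that check schedule would look roughly like this in plain Icinga 2 configuration. This is only a sketch: the template name is made up, `hostalive` is just the usual ping-based check from the template library, and I am assuming the fields in your host template map to the attributes `check_interval`, `retry_interval` and `max_check_attempts`.

```
// Hypothetical template name, for illustration only.
template Host "host-check-sketch" {
  check_command = "hostalive"   // assumed check command

  check_interval = 2m           // regular check every 2 minutes
  retry_interval = 5m           // re-check every 5 minutes while in a soft state
  max_check_attempts = 3        // the 3rd failed attempt turns soft into hard
}
```

Notifications only start once the hard state is reached, so with these values the first notification cannot go out before the retries have run.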

Your notification template is missing the “transition type” Problem.

The notification apply rule you have configured does the following:

  • delays the first notification for 120s after the check reaches the hard state
  • after that, sends a notification every 15min (900s) while the check is still in a hard problem state

So if you want a notification to be sent every 5 minutes while the host is still down, set the notification interval to 5m (or 300s). If you want the first notification to go out immediately, clear the notification delay. If you want the repeat notifications to stop after some time, set the “last notification” field to, for example, 20m, so that you don’t get any more notifications once the host has been down for 20 minutes.
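
As a rough sketch, those settings would look something like this in plain Icinga 2 configuration. The rule name, the user and the `assign where` expression are placeholders, `mail-host-notification` is assumed to be an existing NotificationCommand, and I am assuming the “last notification” field corresponds to `times.end`.

```
// Hypothetical rule name, user and assign expression, for illustration only.
apply Notification "host-down-sketch" to Host {
  command = "mail-host-notification"   // assumed NotificationCommand
  users = [ "icingaadmin" ]            // placeholder user

  types = [ Problem, Recovery ]        // include Problem so DOWN is notified
  states = [ Up, Down ]

  interval = 5m                        // repeat every 5 minutes while still down
  times.end = 20m                      // no more notifications after 20 minutes
  // No times.begin here, so the first notification is not delayed.

  assign where host.address            // placeholder match
}
```

With a 5m interval and a 20m end, the number of repeat notifications is bounded by the window rather than by an explicit count, so adjust the end value to get the number of repeats you want.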
