Several questions about flapping


I’m currently working on an Icinga 2 server that was handed over to me poorly. Flapping detection hasn’t been implemented yet, but I want to set it up. Several things are unclear to me, so a few questions came up. I hope you can help me understand:

  1. What happens when the “Flap Detection” button is activated in the web UI?

What is set up by default? Which thresholds will be used? Or is this an empty button, so that flapping first needs to be enabled globally in some config file before the button does anything?

  2. How can I enable flapping detection for a single service check?

We monitor a huge environment, so simply enabling flapping globally in app.conf is not an option for me. Would this be the right solution?

  if (host.name == "<MY-Hostname>") {
    enable_flapping = true
    flapping_threshold_low = 25
    flapping_threshold_high = 40
  }

I would put this inside a service check configuration. If that is right, are there other ways to enable it for a complete environment or for all service checks?

  3. Where can I see the calculation that flapping detection performs over the last 20 check state changes? This would be important for setting the thresholds correctly. Simply in a config file?

  4. What happens when a host has been detected as flapping? Will its status be shown as green?

  5. The most important one for a huge environment like this:

I would like to simply enable flapping for all checks in our environment, but I also want to keep receiving all alerts (warnings, criticals, and recoveries to green), so that we are merely informed when a check is flapping and nothing else changes. We could then use these mails to adjust the thresholds to the right level and get meaningful, real alerts. How does that work?

I know that’s a lot of questions, but I’m a little lost and hope to get tangible, clear information on how to set this up the right way.

Thank you in advance!

Hi and welcome!

Why not add the flapping configuration to your templates? That way you can easily (de)activate it for all services and change the thresholds. Why would you wrap it in an if clause?
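A minimal sketch of the template approach, assuming a service template called generic-service and an example ping4 service (both names, the assign condition, and the threshold values are placeholders, not taken from your setup):

```icinga2
// Hypothetical template; every service that imports it gets flapping detection.
template Service "generic-service" {
  enable_flapping = true

  // Illustrative values in percent. To my knowledge 30/25 match the
  // documented defaults, but please verify that for your Icinga 2 version.
  flapping_threshold_high = 30
  flapping_threshold_low = 25
}

// Example service that inherits the flapping settings from the template.
apply Service "ping4" {
  import "generic-service"
  check_command = "ping4"
  assign where host.address
}
```

Changing the thresholds or switching flapping off again then only means editing the template, and a single service can still override the values in its own definition after the import line.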

You can’t “see” the calculation it’s doing. The algorithm is rather complicated and is explained in more detail in the Icinga 2 book (sorry, only in German so far).

The algorithm takes the last check results into account and weighs them, so the more recent a result is, the more impact it has on the calculation. You will get a notification when the service starts (or stops) flapping, and while it is still flapping you won’t get any new notifications about state changes, which is the main reason flapping detection was introduced in the first place. It keeps Icinga from spamming you with Ok-NotOk-Ok-NotOk mails.

Is that OK for a short answer? I don’t have time for a longer one, so maybe someone else will step in. If you need more detail, let me know; I might have more time later or in the next few days (no promises).

Hi Thomas,

Thank you for recommending the book you wrote. As a short answer it is OK, but it doesn’t answer my questions. Hopefully someone else can step in at this point.

Thank you for your help.

Nobody around? @twidhalm, maybe you have a little time now to answer my questions? That would be really nice :)

As Thomas said, the actual calculation can’t be seen.
So finding the thresholds will be trial and error until you get the right ones.

A blog post and the docs about flapping, but I guess you have already read those:

Flapping does not change the state; it just suppresses notifications for as long as flapping is detected. In exchange, you can get notifications when flapping starts and stops.

Thank you @log1c for the links. But I asked several questions about flapping in an official Icinga forum. I had hoped there would be somebody around here who is able to explain a bit more about flapping and, much more importantly, is able to answer these questions.

This is still “just” a community forum. If you need “real” support, you will have to contact an official support partner.

That being said, I think you got some answers to your questions already.

Question 1 is answered by the docs.

As Thomas said, the if is not necessary (would it even work?).
To cover all hosts/checks, put the flapping parts into a “generic” host/service template and let the hosts/services import that template.

Question 4 is also answered by the docs and my post.

#5 is implicitly answered by the docs. As long as no flapping is detected, you will get all notifications (problems and recoveries). Once flapping is detected, those will be suppressed and only FlappingStart and FlappingEnd notifications are sent.
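As an illustration, a notification rule along these lines would keep the normal problem and recovery mails and add the two flapping ones. This is only a sketch: the rule name, the user, and the assign condition are assumptions, and mail-service-notification refers to the NotificationCommand from the sample configuration shipped with Icinga 2, which your setup may or may not use.

```icinga2
// Hypothetical notification rule; adjust names and filters to your setup.
apply Notification "mail-with-flapping" to Service {
  command = "mail-service-notification"
  users = [ "icingaadmin" ]

  // State filter: which service states may trigger a notification.
  states = [ OK, Warning, Critical, Unknown ]

  // Type filter: Problem/Recovery are the usual alerts,
  // FlappingStart/FlappingEnd are sent when flapping begins and ends.
  types = [ Problem, Recovery, FlappingStart, FlappingEnd ]

  // Placeholder condition; replace with whatever selects your services.
  assign where host.address
}
```

If FlappingStart and FlappingEnd are missing from the types filter, the service still gets marked as flapping, but nobody is told why the regular mails stopped.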

That’s right. Maybe a developer might swoop in and answer your questions eventually.

The only way to have more detailed research done is by getting professional support; there you can pay someone to dig in or to investigate with the developers. Here in the community forums it’s up to the community members how much they invest, and that depends on how much spare time they have and how motivated they are. I know there are Icinga partners around here, but since this is not an official support channel, you won’t have guarantees on the scope of the answers.

First, I want to thank you both for taking the time to write that. I’m not interested in bothering you or anything like that. Sure, I think the border between “support” and “community” is a shifting one. I am wondering about this argumentation, because there is lots of software out there that comes as “supported software” and as a “community edition”. When someone has a question and has no “paid support”, they have the option of asking the “community”, which is what I did.

Yes, there is documentation about flapping, but the topic is poorly documented. If the documentation were clear enough, I wouldn’t be writing to you here. Furthermore, if everything were explained well enough in documentation, forums for asking such questions wouldn’t be needed and places like Stack Overflow wouldn’t be relevant.

But anyway, I’m really thankful for the information I got from you, @log1c and @twidhalm. I guess I will find out how it works, or whether I really need it.

One last thing: are the thresholds mentioned there the default thresholds that are used when Flap Detection is activated via the web UI?