CPU Load jumping

Good Day everyone,

We have a "problem".
We monitor a bunch of servers in Icinga, and some of them show a jump in load from time to time.
My question is: is there a way to monitor this jump?
For example:

average load = 10%
jumped load = 60%
difference = 50 percentage points
increase = 500% (6x the average)

Is there a way to say: give me a warning/critical if the current load stays x% above the average for longer than a given time?

This is what it looks like:

I thought about flapping, or is this the wrong direction?

No, Icinga relies on simple plugins by default, and they do not provide such options. You might search for better plugins, but I'd assume this is a job for other technology, e.g. Metricbeat or the OTel Collector (and those require a completely different stack).

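Flapping is when a check exceeds a certain number of state changes within a certain time range. It describes a check that keeps switching between states, not a single sustained jump in a metric.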
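To make the difference concrete, here is a minimal sketch of the idea behind flapping: it counts how often consecutive check states differ. The function name `state_change_ratio` is hypothetical, and this is a simplification for illustration; Icinga's actual flapping calculation may differ in details such as window size and weighting.

```python
def state_change_ratio(states):
    """Fraction of consecutive check results that differ from the previous one.

    `states` is a list of check states in chronological order
    (e.g. 0 = OK, 2 = CRITICAL). Simplified illustration only.
    """
    if len(states) < 2:
        return 0.0
    changes = sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)
    return changes / (len(states) - 1)


# One sustained jump: only two state changes in ten results.
sustained = [0, 0, 0, 2, 2, 2, 2, 0, 0, 0]
# A check that really flaps: every result differs from the previous one.
flapping = [0, 2, 0, 2, 0, 2]
```

A single load jump that lasts a while produces few state changes, so a flapping threshold would typically not fire for it. That is why flapping is the wrong tool here.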

So there is no way to do this with a default option in Icinga.

What about writing a plugin or script of our own,
in Bash, Python or whatever, and passing the information to Icinga?

Yes, that's an option. I'd recommend a plugin that queries your performance database and returns warning or critical when your thresholds are exceeded.
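A minimal sketch of what such a plugin could look like in Python, assuming the load samples are already available as a list in chronological order. `fetch_samples()` is a hypothetical placeholder for the query against whatever performance database you use, and the threshold values are only examples. The exit codes 0/1/2/3 (OK/WARNING/CRITICAL/UNKNOWN) are the usual plugin convention.

```python
from statistics import mean

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # usual plugin exit codes
STATE_NAMES = ["OK", "WARNING", "CRITICAL", "UNKNOWN"]


def evaluate(samples, recent_count, warn_pct, crit_pct):
    """Compare the newest `recent_count` samples against the older baseline.

    Returns (state, message). A threshold only fires if *every* recent
    sample exceeds it, i.e. the jump has lasted for the whole window.
    """
    if len(samples) <= recent_count:
        return UNKNOWN, "not enough samples for a baseline"
    baseline = mean(samples[:-recent_count])
    recent = samples[-recent_count:]
    if baseline <= 0:
        return UNKNOWN, "baseline is zero, relative increase undefined"
    # Smallest relative increase within the recent window, in percent.
    increase = (min(recent) - baseline) / baseline * 100
    msg = f"load at least {increase:.0f}% above baseline {baseline:.1f}"
    if increase >= crit_pct:
        return CRITICAL, msg
    if increase >= warn_pct:
        return WARNING, msg
    return OK, msg


def fetch_samples():
    """Hypothetical placeholder: replace with a query to your metrics store."""
    return [10, 11, 9, 10, 10, 60, 62, 61]


def main():
    state, message = evaluate(fetch_samples(), recent_count=3,
                              warn_pct=200, crit_pct=400)
    print(f"{STATE_NAMES[state]} - {message}")
    return state

# In a real plugin the script would end with: sys.exit(main())
```

With the example data the baseline is 10 and the last three samples are all at least 500% above it, so the sketch reports CRITICAL. Using the minimum of the recent window is one way to express "lasting for a certain time"; a short spike of a single sample does not trigger it.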

I can recommend Linuxfabrik's cpu-usage check (monitoring-plugins/check-plugins/cpu-usage in the Linuxfabrik/monitoring-plugins repository on GitHub) as a base. It already uses some statistical information over several runs and stores it in a SQLite DB.
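To illustrate the general idea of keeping statistics across runs in SQLite, here is a small sketch using Python's standard `sqlite3` module. This is not how the Linuxfabrik plugin is implemented; the table name `cpu_samples`, the function name and the `history` parameter are assumptions made up for the example.

```python
import sqlite3
import time


def record_and_compare(conn, value, history=20):
    """Store `value` and compare it with the mean of earlier runs.

    Returns (baseline, increase_pct). Both are None on the first run,
    when no history exists yet. Hypothetical schema, for illustration.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cpu_samples (ts REAL, value REAL)"
    )
    # Read up to `history` previous values before inserting the new one.
    rows = conn.execute(
        "SELECT value FROM cpu_samples ORDER BY rowid DESC LIMIT ?",
        (history,),
    ).fetchall()
    conn.execute(
        "INSERT INTO cpu_samples (ts, value) VALUES (?, ?)",
        (time.time(), value),
    )
    conn.commit()
    if not rows:
        return None, None
    baseline = sum(row[0] for row in rows) / len(rows)
    if baseline == 0:
        return baseline, None  # relative increase is undefined
    return baseline, (value - baseline) / baseline * 100
```

Each plugin run would open the same database file, call the function once with the current CPU usage, and map the returned increase to OK, WARNING or CRITICAL.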

If you get it to work, then I would appreciate a pull request, so I can incorporate your addition in my monitoring.

You could also ask Linuxfabrik for a quote if you don't want to or can't code it yourself.

Okay, that would be for Linux.
What about Windows? We also have some Windows servers that we want to monitor in the same way.

I'm one of the reasons why most of Linuxfabrik's checks also work under Windows and are distributed as .exe :wink:

Ah okay, I didn't see that.
I will dig into it and see what I can do with it.

thank you
