Icinga Benchmarks

theFeu · June 22, 2020, 11:40am

I’ve seen a lot of questions about benchmarks, both here in the community forums, but also from the Icinga support and developers.

This made me want to create a spot where we can collect these, which is going to be here!

There is a kind of similar topic that is in the scope of “What kind of environments are out there?”, but it doesn’t answer many questions on the performance side:

Share your Icinga Environment

So I would like to ask for your input on how we should define some benchmarks and get some results from different environments!

Pooh · June 22, 2020, 12:06pm

I’ve seen a lot of questions about benchmarks, both here in the community
forums, but also from the Icinga support and developers.

This made me want to create a spot where we can collect these, which is
going to be here!

There is a kind of similar topic that is in the scope of “What kind of
environments are out there?”, but it doesn’t answer many questions on the
performance side:

Share your Icinga Environment

Indeed - that’s all very well from an anecdotal point of view, but too much to
read for someone looking for a convenient summary / comparison.

So I would like to ask for your input on how we should define some
benchmarks and get some results from different environments!

I would say the most important aspect is that the information has to be
tabular, so that viewers can compare one entry with another easily and
quantitatively.

We probably can’t define all the parameters which are important in the table at
the start, so they’ll change a little over time, but I think the important
part is to make sure we ask people for numbers which can be compared
like-for-like, to give newcomers an opportunity to see “this installation is coping
with X servers and Y services based on Z hardware, so that gives me a good
approximation to what I’ll need”.
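
To make the "tabular" idea concrete, here is a minimal sketch of what one comparable entry and the resulting table might look like. The field names (`site`, `hosts`, `services`, `cpu_cores`, `ram_gb`) are assumptions for illustration only, since no schema has been agreed yet, and the two rows are made-up placeholder values, not real measurements.

```python
from dataclasses import dataclass, astuple, fields


@dataclass
class BenchmarkEntry:
    # Hypothetical columns; the actual parameters are still to be decided.
    site: str
    hosts: int
    services: int
    cpu_cores: int
    ram_gb: int


def render_table(entries):
    """Return a plain-text table: one header row, then one row per entry."""
    header = [f.name for f in fields(BenchmarkEntry)]
    rows = [header] + [[str(v) for v in astuple(e)] for e in entries]
    # Pad every column to the width of its longest cell.
    widths = [max(len(row[i]) for row in rows) for i in range(len(header))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )


# Placeholder rows only, to show the layout.
entries = [
    BenchmarkEntry("example-a", 1200, 9500, 8, 16),
    BenchmarkEntry("example-b", 300, 2100, 4, 8),
]
print(render_table(entries))
```

Keeping every value in a fixed column is what makes the like-for-like comparison possible; free-text answers would not line up this way.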

I think it’s also important to think about how Icinga2 actually works, in
terms of Masters, Satellite and Agents, because one Master talking to 50
Satellites, each talking to 500 Agents, is going to have a very different
workload from one Master trying to talk to 25000 Agents…

I think the point I’m getting at there is that we need to know about directly
connected Agents - the indirect ones are a lot less important to system load
(although they contribute to bandwidth).
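
The difference between direct and indirect connections can be sketched with the numbers from the example above. This is only a toy model: the dictionary layout and the names `build_tiered`, `direct` and `total_below` are made up for illustration and say nothing about how Icinga2 itself stores its zones.

```python
def build_tiered(satellites, agents_each):
    """Map each node to the list of nodes connected directly below it."""
    topo = {"master": [f"sat{i}" for i in range(satellites)]}
    for i in range(satellites):
        topo[f"sat{i}"] = [f"agent{i}-{j}" for j in range(agents_each)]
    return topo


def direct(topo, node):
    """Number of nodes connected directly to this node."""
    return len(topo.get(node, []))


def total_below(topo, node):
    """Number of nodes below this node, direct and indirect."""
    children = topo.get(node, [])
    return len(children) + sum(total_below(topo, c) for c in children)


tiered = build_tiered(50, 500)
flat = {"master": [f"agent{i}" for i in range(25000)]}

print(direct(tiered, "master"))       # 50 satellites
print(direct(tiered, "sat0"))         # 500 agents on one satellite
print(total_below(tiered, "master"))  # 25050 = 50 satellites + 25000 agents
print(direct(flat, "master"))         # 25000 agents on a single master
```

Both layouts monitor 25000 Agents, but the tiered Master holds 50 direct connections while the flat Master holds 25000, which is why the survey should ask for the direct count per machine.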

Do you have any ideas at all about how people could actually submit this type
of information, and then be able to update it as their systems change in the
future?

It feels to me like it needs to be some sort of “semi-private wiki”, where
anyone can add data to it, but only that person / organisation can edit their
data later on. Not sure how that can be done…

Anyway, those are my thoughts for the time being; once I have a bit more time
I’ll see if I can think of suggested measurements for people to supply their
values for.

Regards,

Antony.

theFeu · June 22, 2020, 12:16pm

Thanks for your input!

I already have an idea of how to make that single-write kind of table happen: basically a submit form in the style of a survey (we have tested out some tools for survey building anyway), with the data then displayed in a table.
Maybe I can also convince Discourse to display it somehow…

I’ll think about that for a bit, but thanks for the inspiration!

Edit: Had a chat with Blerim about it, and we might also consider doing it a little more “manually” with a read-only table document that we maintain, which could be displayed either here or on some other platform (Google Docs?).
We shall see!

When it comes to how to build the survey, question-wise, it’s a little more tricky…

Have a nice day,
Feu

Pooh · June 22, 2020, 2:38pm

Suggestions:

For each Master and Satellite in your network:

Is this running on a virtual machine or a physical server?
What’s the CPU spec (number of cores, GHz)?
Which Operating System are you using? 32bit or 64bit?
How much RAM does the machine have?
Are you running Icingaweb2 on the same server?
Are you running any other significant applications on the same server?
How many directly connected systems is this machine monitoring?
Of those, how many run the Icinga Agent, how many use SSH remote command
execution, how many use SNMP, how many use something else?
How many Hosts are shown in “icinga2 daemon -C”?
How many Services are shown in “icinga2 daemon -C”?
What check_interval do you use for the majority of service checks?
What is the typical 15-minute Load Average of the server?
(Need some equivalent of that for running under Windows)
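
A few of these values could be gathered by a small script rather than by hand. The sketch below is only a starting point under stated assumptions: it assumes the output of “icinga2 daemon -C” contains lines of the form "Instantiated 12 Hosts." and "Instantiated 340 Services." (please check against your own output), and the sample text is hand-written for the example rather than copied from a real run. The function names are made up.

```python
import os
import platform
import re

# Assumed line format: "... Instantiated <N> Hosts." / "... Instantiated <N> Services."
COUNT_RE = re.compile(r"Instantiated (\d+) (Host|Service)s?\b")


def parse_object_counts(validation_output):
    """Extract Host and Service counts from config validation output text."""
    counts = {"Host": 0, "Service": 0}
    for number, kind in COUNT_RE.findall(validation_output):
        counts[kind] = int(number)
    return counts


def local_facts():
    """Collect basic facts about the machine this script runs on."""
    facts = {
        "os": platform.system(),
        "arch": platform.machine(),
        "cpu_cores": os.cpu_count(),
    }
    # os.getloadavg() exists only on Unix-like systems, so it is skipped
    # on Windows, where an equivalent figure is still an open question.
    if hasattr(os, "getloadavg"):
        facts["load_15min"] = os.getloadavg()[2]
    return facts


# Hand-written sample in the assumed format, not real output.
sample = (
    "information/ConfigItem: Instantiated 12 Hosts.\n"
    "information/ConfigItem: Instantiated 340 Services.\n"
)
print(parse_object_counts(sample))
print(local_facts())
```

Questions such as the majority check_interval or how the connected systems split between Agent, SSH and SNMP would still need a manual answer.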

Regards,

Antony.