Icinga2 distributed monitoring Master Satellites Network usage

Dear Icinga2 Community,

I have read several posts and documentation pages about Icinga2.

In a Single-Master or Multi-Master (Cluster/HA) configuration with Satellites, is it possible to estimate the network utilization of the communication between the Master(s) and the Satellites?

Of course, the more hosts/services you monitor, the more messages are sent through the Icinga2 data exchange on port 5665…

What I want to know is whether there is an overview showing the number of events (Satellite to Master) and the bandwidth usage in different scenarios, like the ones reported in the “icinga2 at large scale” post:

Cluster S: up to 500 clients
Cluster M: up to 5000 clients
Cluster L: up to 10000 clients
Cluster XL: up to 30000 clients

The purpose is also to know whether the communication between Masters and Agents has to be routed over dedicated, guaranteed network links, and what its impact on network utilization would be (since the links may be shared with other services).



I think this is going to be highly sensitive to the actual service checks you
need to perform.

Obviously you have communication between the Master and each Satellite, and
between each Agent and whichever Master or Satellite it is downstream from.

However, the volume of those communications is going to depend on a
combination of:

  • how many service checks each Agent is performing
  • how frequently you are performing the checks
  • the volume of data returned from each check (including performance data)

I think it would be pretty difficult to come up with “average values” for
communication bandwidth usage, given that different installations will have all
sorts of different combinations of the above.
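As a back-of-the-envelope illustration of how the three factors above combine, here is a minimal Python sketch. The function name and the example figures (10 checks, 60 second interval, 1 KiB per result) are made up for illustration, not measured values, and the calculation ignores TLS and protocol overhead as well as any other cluster traffic:

```python
def agent_bytes_per_second(checks: int, interval_s: float, bytes_per_result: float) -> float:
    """Rough payload rate for one Agent.

    checks           -- number of service checks the Agent runs
    interval_s       -- check interval in seconds (assumed equal for all checks)
    bytes_per_result -- assumed average size of one check result, perfdata included
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # results per second multiplied by bytes per result
    return checks / interval_s * bytes_per_result


# Hypothetical Agent: 10 checks, each run every 60 s, about 1 KiB per result.
rate = agent_bytes_per_second(checks=10, interval_s=60, bytes_per_result=1024)
print(f"approx. {rate:.1f} bytes/s of check result payload")
```

Doubling the number of checks or halving the interval doubles the result, which is why a single "average" figure is hard to give.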

The best you could do would be to set up one or more Agents which are
“typical” of your monitoring needs, measure the network bandwidth generated by
them (filtering by port number 5665 should be selective enough) and then use
this number, scaled by the number of Agents you want to go up to, to estimate
your total bandwidth needs.
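The scaling step could be sketched like this in Python. The per-Agent figure of 200 bytes/s is a placeholder, not a measurement; replace it with what you actually observe on port 5665. The cluster sizes are the ones from the original question, and the linear scaling assumes every Agent behaves like the one you measured:

```python
# Cluster sizes quoted in the question (upper bounds on the number of clients).
CLUSTER_SIZES = {"S": 500, "M": 5000, "L": 10000, "XL": 30000}


def total_mbit_per_second(per_agent_bytes_per_s: float, agents: int) -> float:
    """Scale a measured per-Agent rate linearly and convert bytes/s to Mbit/s."""
    return per_agent_bytes_per_s * agents * 8 / 1_000_000


# Placeholder value: substitute the rate measured for a "typical" Agent.
measured_bytes_per_s = 200.0

for name, agents in CLUSTER_SIZES.items():
    mbit = total_mbit_per_second(measured_bytes_per_s, agents)
    print(f"Cluster {name}: {agents} Agents, approx. {mbit:.1f} Mbit/s in total")
```

Note that this gives the sum over all Agents. How much of it reaches a given Master or Satellite depends on how the Agents are distributed across zones.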


I agree with Antony: this is highly dependent on what you are checking.
Nevertheless, here are some numbers from one of my systems:

HA-Master with (currently) 16 zones (each with one or two satellites) and ~4500 checks (mostly run in the satellite zones)
Interface usage on the master (non-filtered) during the last 30 days

two satellite examples (each around 80 hosts and 800 checks)

