Best Practice: How many systems should a master monitor?

LiveNorm · January 26, 2023, 10:45am

Hello dear community,

I would like to set up an Icinga monitoring system in my company, so for the past few days I have been working intensively on the configuration of such a system and on general best practices. I think I have a good understanding of what masters, satellites and agents are and how to set up a highly available system.
Unfortunately, the otherwise very good documentation could not answer one question for me: How many systems should a master monitor, or at what point does it make sense to offload monitoring to several satellites and have the master monitor only those? Is there a practical limit to how much the master should take on, or can a master monitor an almost unlimited number of systems as long as it has enough resources?

Ultimately, several thousand systems at one location need to be monitored.

Kind regards
Alex

Pooh · January 26, 2023, 11:24am

“It depends”

You talk about monitoring several thousand systems, but equally important are
how many services you are checking on those systems, and how frequently you
are performing checks.

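You will greatly reduce the load on the master by using top-down config sync
instead of command endpoint: with config sync the satellites and agents
schedule their checks themselves, whereas with command endpoint the master
schedules every check.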

Beyond that, the best you can do is to measure the CPU load and network
bandwidth on your master machine to see how well it is coping with your own
combination of X machines being monitored for Y services at intervals of Z
minutes.
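
As a rough illustration of that X/Y/Z combination, the average check rate is
about X * Y / Z. The following is only a minimal Python sketch, assuming every
service uses the same check interval and ignoring host checks; the function
name and the example numbers are hypothetical, and the result says nothing
about the hardware needed to sustain that rate.

```python
def checks_per_second(hosts: int, services_per_host: int, interval_minutes: float) -> float:
    """Estimate the average number of check executions per second.

    Assumption: all services share one check interval, and host checks
    are not counted.
    """
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    total_services = hosts * services_per_host
    return total_services / (interval_minutes * 60)


# Hypothetical example: 3000 hosts, 20 services each, checked every 5 minutes.
rate = checks_per_second(3000, 20, 5)
print(rate)  # 200.0
```

Comparing this figure with the measured CPU load and bandwidth over time shows
how the master copes as X, Y or Z changes.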

I don’t think anyone here can give you a formula for calculating either what
performance machine you need to support a given infrastructure, or how much
monitoring a machine of a given spec can manage.

Antony.

log1c · January 26, 2023, 12:11pm

This is difficult to answer, because it mostly depends on how resource-hungry the checks are and how often they run.

You can take a look here: Icinga2 at large scale

In general, if you use the IDO database, I would recommend running it on a separate node.
With the newer Icinga DB (Redis plus MySQL/PostgreSQL), the load caused by database queries should decrease.