Number of devices monitored on Icinga

Hello. I am new to this forum and to this tool.
I have been searching the release notes, but I really can't find any sizing guidance for the tool, or the number of devices that a single Icinga master or satellite can monitor.
We are about to start a project and we have to tell the final customer how many servers we need to monitor their network. It would be about 5,000 devices.

If you could point me in the right direction, I would really appreciate it.

Regards

You need more information than just “5000 devices”.

For example, are these “devices” servers which can run the Icinga Agent
locally, or are they routers, switches and similar things which need another
machine to get data from them by, for example, SNMP?

Also, how many service checks do you need to run on each device? I mean
things like disk space, process numbers, load average, network traffic…

Finally, how frequently do you need updates of the data (how long can you
accept between service checks, when you won’t know there’s a problem if one is
just starting)?

Oh, and where are these “devices”? All on a local high-speed network, or
widely distributed across the world on some fast, some slow, network links, or
in a few good-bandwidth data centres?

All of those things will play a role in determining:

a) what architecture you need to set up (one or two masters, zero or more
satellites)

b) what sort of spec you need for the machines involved.
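To make the questions above concrete, here is a rough back-of-the-envelope sketch in Python. It only estimates the average number of check executions per second, assuming checks are spread evenly over the interval. The 5000 devices come from the question; the 10 services per device and the two intervals are made-up figures for illustration, not recommendations.

```python
def checks_per_second(devices: int, services_per_device: int,
                      check_interval_s: float) -> float:
    """Average check executions per second, assuming an even spread."""
    return devices * services_per_device / check_interval_s


if __name__ == "__main__":
    # 5000 devices is from the question; 10 services per device and the
    # intervals below are hypothetical values, chosen only for illustration.
    for interval_s in (60, 300):
        rate = checks_per_second(5000, 10, interval_s)
        print(f"check_interval={interval_s}s -> about {rate:.0f} checks/s overall")
```

Halving the interval doubles the rate, which is why the acceptable delay between checks matters as much as the device count.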

Regards,

Antony.


Hello Antony, thanks for answering so soon!

We are planning to monitor Cisco UCS devices, Linux servers, Windows servers and VMware.

We would like to check CPU, memory, interface status, processes and things like that. We would also like to receive traps and create alarms from them.

The devices will be deployed in different cities around the world, and the final customer will probably have lines connecting the different locations.

Hi, that is hard to tell. The main performance impact will come from the checks you use. If you use check_nwc_health, for example, I would think you need at least 3-4 servers with 4 cores and 4-8 GB RAM each, as the check is very resource-hungry. I have tried to work out how to fix that, because the server load is not that high, but I get a lot of processes in “D” state, followed by timeouts, especially when Icinga is reloading.

With other checks like check_snmp you should be able to run many more checks per host.

It also depends on your check_interval.

I strongly recommend using satellites at the different locations. That way you can load-balance your checks, and you will not get so many UNKNOWN results if the connection is bad. As SNMP runs over UDP, you will otherwise get problems with packet loss. You should not have a problem with a single master if you divide your checks among the satellites per location.
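To illustrate why packet loss hurts SNMP polling over a bad link, here is a small Python sketch. It is a simplified model, not a measurement: it assumes each poll is one request packet and one response packet, that losses are independent, and that every retry is an independent attempt. The function name and the 5% loss figure are invented for the example.

```python
def snmp_check_failure_probability(packet_loss: float, attempts: int) -> float:
    """Chance that every attempt of a UDP request/response poll fails.

    Simplified model (assumption): one request and one response packet per
    attempt, each lost independently with probability `packet_loss`.
    """
    round_trip_ok = (1.0 - packet_loss) ** 2
    return (1.0 - round_trip_ok) ** attempts


if __name__ == "__main__":
    # 5% loss is a hypothetical figure for a poor long-distance link.
    for attempts in (1, 2, 3):
        p = snmp_check_failure_probability(0.05, attempts)
        print(f"attempts={attempts}: {p:.4%} of polls fail")
```

Under these assumptions a single attempt fails far more often than the raw loss rate suggests, because both directions have to succeed. Polling from a satellite on the local network avoids the lossy link for the SNMP traffic itself.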


As a starting point for the architecture, I would suggest:

a) one Satellite per major location (your “different cities around the world” -
I hope you mean all the machines in Toronto are in one data centre etc)

b) Linux and Windows can run the Icinga Agent, reporting to their local
Satellite, which then reports to the Master/s

c) you’ll need one local Satellite per location to monitor any SNMP etc
devices (this might be quite a low-powered machine, though, depending on the
number of devices at the location). It can be the same machine that the
Linux / Windows servers report to.

d) all the Satellites then connect to a single Master (or an HA pair if you
want that feature), which is where you host Icingaweb2 for visualisation of
the information.

e) If you have any really tiny outposts, with just two or three things which
can’t run the Icinga Agent, they could be monitored from the Master or any
nearby (to the outpost) Satellite.
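The layout above can be pictured as a simple tree. The Python sketch below is only a model of who reports to whom, not Icinga configuration, and every node name in it is hypothetical.

```python
from typing import Dict, List

# Child -> parent relationships. Every name here is hypothetical.
PARENT: Dict[str, str] = {
    "satellite-toronto": "master",
    "satellite-madrid": "master",
    "linux-agent-01": "satellite-toronto",    # runs the Icinga Agent
    "windows-agent-01": "satellite-toronto",  # runs the Icinga Agent
    "core-switch-01": "satellite-madrid",     # SNMP device polled by the satellite
}


def path_to_master(node: str) -> List[str]:
    """Follow parent links upward until a node with no parent is reached."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def direct_children(parent: str) -> List[str]:
    """Everything that reports directly to (or is polled by) `parent`."""
    return sorted(child for child, p in PARENT.items() if p == parent)


if __name__ == "__main__":
    print(" -> ".join(path_to_master("linux-agent-01")))
    print(direct_children("master"))
```

The point of the tree is that the Master only talks to a handful of Satellites, however many machines sit underneath them.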

Finally, remember to keep all the connections between Satellites and Master/s
secure :)

Regards,

Antony.


Thanks for all the answers. The installation is going to be done in 17 cities. So would I have to install 17 Satellites?
Regards

You do not need to, but you can. It depends on your connection and its stability.

Edit: and load.

I would install a Satellite if any of the following are true:

a) the connection to that location (from the Master) is unreliable, and
service check results may be lost during times of no connectivity (the
Satellite will store them locally and then relay on to the Master when the
connection comes back)

b) you have enough machines, multiplied by the number of service checks, at
that location, that it’s worth offloading the service check management from the
Master to a Satellite. I can’t tell you what “enough” is, because we don’t
know what sort of service checks you’re doing or how often you want them to be
performed.

c) you have enough devices at a location which cannot run the Icinga Agent and
need to be polled or collated by SNMP etc. Again, I can’t say what “enough”
is unless we know how many SNMP checks you’re trying to do and how frequently.

d) local management want to have their own view of the service checks for just
that location (which you can do by installing Icingaweb2 on the Satellite).

As I previously said: If you have any really tiny outposts, with just two or
three things which can’t run the Icinga Agent, they could be monitored from
the Master or any nearby (to the outpost) Satellite.
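The criteria above amount to an "any of these is true" rule, which can be sketched in Python as follows. The thresholds are placeholders only: as said above, what counts as "enough" depends on your checks and intervals, so `max_remote_checks` and `max_remote_snmp_devices` are hypothetical values you would have to choose yourself (the default of 3 echoes the "two or three things" for tiny outposts).

```python
from dataclasses import dataclass


@dataclass
class Location:
    name: str
    unreliable_link: bool        # criterion a)
    agent_hosts: int             # machines running the Icinga Agent
    snmp_devices: int            # devices that cannot run the Agent
    checks_per_host: int
    wants_local_web_view: bool   # criterion d)


def needs_satellite(loc: Location,
                    max_remote_checks: int = 200,
                    max_remote_snmp_devices: int = 3) -> bool:
    """True if any of criteria a) to d) applies. Thresholds are placeholders."""
    total_checks = (loc.agent_hosts + loc.snmp_devices) * loc.checks_per_host
    return (
        loc.unreliable_link                              # a) flaky connection
        or total_checks > max_remote_checks              # b) enough check load
        or loc.snmp_devices > max_remote_snmp_devices    # c) enough SNMP devices
        or loc.wants_local_web_view                      # d) local view wanted
    )


if __name__ == "__main__":
    outpost = Location("tiny-outpost", False, 0, 2, 5, False)
    city = Location("big-city", False, 300, 40, 10, False)
    for loc in (outpost, city):
        print(loc.name, "needs a Satellite:", needs_satellite(loc))
```

Running this over all 17 cities with your own numbers would give a first cut of which locations justify a Satellite and which can be monitored from elsewhere.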

Regards,

Antony.
