I’m in the process of migrating from IDO to IcingaDB, and I’m having trouble finding detailed best practices and recommendations for sizing the database and Icinga-Redis in the official documentation.
In particular, I am seeking guidance on:
Sizing the database and Icinga-Redis: Are there any rules of thumb or guidelines to follow for estimating the required RAM, CPU, and IOPS, especially considering the scale of monitoring required in a multi-master setup?
Self-monitoring: What metrics should be monitored regarding IcingaDB and Icinga-Redis in a multi-master setup to ensure optimal performance and early problem detection?
Any insights or pointers to relevant resources would be greatly appreciated.
It is important to consider the number of hosts and services. My environment has 338 hosts and 2,832 services. I hope this information is of some help to you.
Thank you for your response. Unfortunately, I’m working with a significantly larger setup, so it isn’t directly comparable. My database runs on a 3-node Galera cluster with 4 CPUs and 24 GB RAM each. Both the masters and satellites are equipped with 16 GB RAM and 16 CPUs each.
The IDO database currently stands at approximately 5.5 GB with a 1-year retention policy applied. A configuration reload takes about 2 minutes and 30 seconds. I’m not at liberty to disclose the number of hosts and services monitored by this setup. I just want to avoid unpleasant surprises, because I cannot reproduce the actual load in the development and staging environments.
Server sizing will depend on the size of your setup, so I’m not going to suggest specifics. Beyond server size, though, some of the things you want to consider if you are planning to have thousands or tens of thousands of hosts and services are:
keeping things separate helps: Icinga masters, the database, and IcingaWeb2 each on their own hosts.
don’t run checks on the masters; if that means creating a dedicated satellite to run checks, do that (a minimal zone sketch follows this list).
be careful of how you use apply rules and/or automation. You should use them, but try to break things up a little, so that when you change something only part of your configuration changes rather than all of it (there is an apply-rule sketch of this at the end of this post). The reload bomb is real.
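To make the dedicated-satellite point concrete, here is a minimal zones.conf sketch of a separate check zone. This is only an illustration: the endpoint name, address and zone name are placeholders, and it assumes the master zone is called “master”.

```
// Minimal sketch: a dedicated satellite zone so checks do not execute on the masters.
// Endpoint name, address and zone name are placeholders.
object Endpoint "satellite1.example.com" {
  host = "192.0.2.10"               // placeholder address
}

object Zone "satellite-checks" {
  endpoints = [ "satellite1.example.com" ]
  parent = "master"                 // assumes the master zone is named "master"
}
```

Hosts assigned to that zone (for example via Director’s cluster zone setting, or by placing their config under zones.d/satellite-checks) then have their checks executed by the satellite rather than the masters.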
To expand on the apply rules and automation point: if you import hosts with Director automation, it is better to have multiple small import sources and sync rules rather than a single large import source and sync rule.
This is because a change to a single large import source may result in all of your hosts changing, which is a large reload task for the masters. Multiple small import sources allow you to make more targeted changes, or to spread them out over multiple deployments. It also helps contain human error in sync rules, as the fallout is smaller.
Examples of breaking up import sources might be AWS hosts, VMware hosts, and physical hosts, or splitting by department. The best way to do this depends on your needs.
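The same “break it up” idea applies to plain apply rules as well as Director sync rules. A rough sketch, assuming a made-up host.vars.platform custom variable and the generic-service template from the sample configuration:

```
// Rough sketch: one apply rule per platform instead of a single catch-all rule,
// so editing one rule only re-renders that slice of the configuration.
// host.vars.platform is a made-up custom variable used for illustration.
apply Service "ping-vmware" {
  import "generic-service"          // assumes the sample generic-service template exists
  check_command = "ping4"
  assign where host.vars.platform == "vmware"
}

apply Service "ping-aws" {
  import "generic-service"
  check_command = "ping4"
  assign where host.vars.platform == "aws"
}
```

Editing the “vmware” rule then only touches those services, which keeps individual deployments and reloads smaller.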