IcingaDB Redis continuously growing on secondary-master

dangelovich_basis · July 9, 2025, 5:18pm

My setup has a primary and a secondary in the Master zone, plus multiple other zones with two satellites each. Each master has its own independent Redis instance and Icinga DB instance, and both share one backend database (MySQL).

Both masters are running checks, and the system seems fine. I got an alert about memory usage on my secondary master, and started looking into it - yesterday Redis RSS was at 1.4GB, today at 1.8GB.

When I check various Redis keys/streams, the secondary’s are much larger than the primary:

The memory usage is also vastly different:

Key	Bytes (Secondary Master)	Bytes (Primary Master)
icinga:checkcommand	152777	141033
icinga:checkcommand:argument	1133889	1178600
icinga:checkcommand:customvar	376880	376880
icinga:checkcommand:envvar	1923	1923
icinga:checksum:checkcommand	41032	41032
icinga:checksum:checkcommand:argument	507960	507960
icinga:checksum:checkcommand:envvar	720	720
icinga:checksum:comment	296	296
icinga:checksum:downtime	18360	22848
icinga:checksum:endpoint	204192	204192
icinga:checksum:host	221728	221728
icinga:checksum:host:state	221736	221736
icinga:checksum:hostgroup	3144	3144
icinga:checksum:notification	7874264	7874264
icinga:checksum:notificationcommand	720	720
icinga:checksum:notificationcommand:envvar	7256	7256
icinga:checksum:service	8079760	8079760
icinga:checksum:service:state	8079760	8079760
icinga:checksum:servicegroup	1864	1864
icinga:checksum:timeperiod	968	968
icinga:checksum:user	3136	3136
icinga:checksum:usergroup	712	712
icinga:checksum:zone	203368	203368
icinga:comment	1600	1600
icinga:customvar	1983880	1983880
icinga:downtime	114313	149304
icinga:dump	9824	4736
icinga:endpoint	557464	557464
icinga:history:stream:acknowledgement	412	412
icinga:history:stream:downtime	396	396
icinga:history:stream:flapping	396	396
icinga:history:stream:notification	404	396
icinga:history:stream:state	396	404
icinga:host	1477208	396
icinga:host:customvar	1917616	1477208
icinga:host:state	1670368	1917616
icinga:hostgroup	12136	1747628
icinga:hostgroup:member	1054312	12136
icinga:nextupdate:host	191028	1054312
icinga:nextupdate:service	6939187	198272
icinga:notes:url	1008	6592480
icinga:notification	45920528	1008
icinga:notification:customvar	31248	45920528
icinga:notification:recipient	154858456	31248
icinga:notification:user	276544	154858456
icinga:notification:usergroup	14785688	276544
icinga:notificationcommand	2928	14785688
icinga:notificationcommand:envvar	20494	2928
icinga:runtime	600468848	21364
icinga:runtime:state	946190028	380
icinga:schema	1024	388
icinga:service	58567219	4736
icinga:service:customvar	64569712	55722854
icinga:service:state	68522504	64569712
icinga:servicegroup	7432	81322145
icinga:servicegroup:member	1055632	8123
icinga:stats	10880	1055632
icinga:timeperiod	3627	10880
icinga:timeperiod:range	11192	3524
icinga:user	15448	11192
icinga:usergroup	2536	15819
icinga:usergroup:member	5264	2536
icinga:zone	643040	5264
icingadb:overdue:service	14200	643040
icingadb:telemetry:heartbeat	4752	4752
icingadb:telemetry:stats	24088	37424

I’m wondering if there’s an issue here, or if the secondary is just holding more data than the primary for “reasons”, or if it’s somehow misbalanced and the secondary is just doing more work.
I’d have expected that once things get written out to the DB, they would be removed from Redis.

Can anyone advise on the reason for the difference?

Icinga DB Web version (System - About): 1.1.3
Icinga DB Redis version: 7.2.6
Icinga Web 2 version (System - About): 2.12.2
Web browser: Safari
Icinga 2 version (icinga2 --version): r2.14.3-1
Icinga DB version (icingadb --version): v1.2.0
PHP version used (php --version): 8.2.28
Server operating system and version: Debian 12

apenning · July 10, 2025, 7:33am

Welcome to the Icinga Community and thanks for coming forward with your issue in such a detailed way.

First things first: the versions of your installed Icinga components are quite outdated. There are also some security updates missing, for example for Icinga 2 (not really relevant for Debian 12, though) and Redis. Furthermore, there were quite a few changes in Icinga DB between 1.2.0 and 1.4.0.

How does the memory usage of Redis develop over time? Do you have more data points, for example, from a perfdata writer? And how does the memory consumption of the primary master compare?

Looking at your table, lots of values are equal or quite similar, but there are a few outliers in both directions, e.g., icinga:host (1,477,208 vs 396) or icinga:hostgroup (12,136 vs 1,747,628).
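To spot such outliers systematically rather than by eye, a small script can collect the per-key sizes from both Redis instances and rank them by difference. The following is only a sketch: it assumes the redis-py package is installed, that the Icinga DB Redis listens on port 6380, and the hostnames in `main()` are placeholders. `collect_key_sizes` and `largest_differences` are hypothetical helper names, not part of any Icinga tooling.

```python
def collect_key_sizes(host, port=6380, pattern="icinga*"):
    """Return {key: bytes} via MEMORY USAGE for every key matching pattern."""
    # redis-py is imported lazily so the comparison helper below
    # can be used (and tested) without a Redis connection.
    import redis

    client = redis.Redis(host=host, port=port, decode_responses=True)
    sizes = {}
    for key in client.scan_iter(match=pattern, count=1000):
        usage = client.memory_usage(key)  # None if the key vanished meanwhile
        if usage is not None:
            sizes[key] = usage
    return sizes


def largest_differences(secondary, primary, top=10):
    """Rank keys by absolute byte difference between the two instances."""
    keys = set(secondary) | set(primary)
    rows = [(key, secondary.get(key, 0), primary.get(key, 0)) for key in keys]
    rows.sort(key=lambda row: abs(row[1] - row[2]), reverse=True)
    return rows[:top]


def main():
    # Placeholder hostnames (assumption): replace with the two masters.
    secondary = collect_key_sizes("master2.example.com")
    primary = collect_key_sizes("master1.example.com")
    for key, sec_bytes, pri_bytes in largest_differences(secondary, primary):
        print(f"{key}\t{sec_bytes}\t{pri_bytes}\t{sec_bytes - pri_bytes:+d}")
```

Calling `main()` on a host that can reach both instances prints the ten keys with the largest gap. Collecting both sides in one run also avoids copy-and-paste alignment mistakes when building such a table by hand.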

More interestingly, do you have the icingadb check command defined for each Icinga DB node? This check reports lots of useful performance data metrics, like icinga2_redis_query_backlog, icingadb_history_backlog, or icingadb_runtime_update_backlog. Is any of those values greater than zero, and how do they vary over time? Again, a perfdata writer might be useful.
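If no perfdata writer is in place yet, the backlog metrics can also be pulled out of the check's raw performance data string. The sketch below assumes the usual plugin perfdata layout (`label=value[UOM];warn;crit;min;max`); the function names are hypothetical, and only the three metric names come from the check mentioned above.

```python
import re

# Metric names as reported by the icingadb check.
BACKLOG_METRICS = (
    "icinga2_redis_query_backlog",
    "icingadb_history_backlog",
    "icingadb_runtime_update_backlog",
)

# Matches label=value, where the label may be single-quoted.
# Units and thresholds after the number are ignored.
_PERFDATA_RE = re.compile(r"(?:'([^']+)'|([^\s=']+))=(-?\d+(?:\.\d+)?)")


def parse_perfdata(text):
    """Return {label: value} for every metric in a perfdata string."""
    values = {}
    for quoted, bare, number in _PERFDATA_RE.findall(text):
        values[quoted or bare] = float(number)
    return values


def nonzero_backlogs(text, metrics=BACKLOG_METRICS):
    """Return only the backlog metrics whose value is greater than zero."""
    values = parse_perfdata(text)
    return {name: values[name] for name in metrics if values.get(name, 0) > 0}
```

Running `nonzero_backlogs()` on the perfdata of each master at regular intervals gives a rough picture of whether a backlog is building up on the secondary only.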

Another thing to consider is Redis’ default behavior of periodically dumping its state to disk. Please have a look at the “Huge memory footprint and IO usage in large setups” section in the Operations manual and check if this applies to your setup.
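To see whether snapshotting is active at all, the `save` setting and the persistence section of `INFO` can be inspected on each instance. This is a minimal sketch, again assuming redis-py and port 6380; the helper names are hypothetical, and the field names used are those Redis reports in `INFO persistence`.

```python
def parse_save_rules(save_setting):
    """Turn '3600 1 300 100' into [(3600, 1), (300, 100)].

    An empty string means no RDB snapshot rules are configured.
    """
    numbers = [int(part) for part in save_setting.split()]
    return list(zip(numbers[0::2], numbers[1::2]))


def persistence_summary(save_setting, info):
    """Condense the save setting and INFO persistence into a small dict."""
    rules = parse_save_rules(save_setting)
    return {
        "rdb_snapshots_enabled": bool(rules),
        "save_rules": rules,  # (seconds, changes) pairs
        "aof_enabled": bool(int(info.get("aof_enabled", 0))),
        "changes_since_last_save": int(info.get("rdb_changes_since_last_save", 0)),
    }


def fetch_persistence_summary(host, port=6380):
    """Query a live instance; host and port are assumptions about the setup."""
    import redis  # lazy import, only needed for the live query

    client = redis.Redis(host=host, port=port, decode_responses=True)
    save_setting = client.config_get("save").get("save", "")
    return persistence_summary(save_setting, client.info("persistence"))
```

Comparing the result for both masters shows quickly whether they run with the same persistence settings.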