Icinga2 consumes more than 60 GB of memory (see attached) and eventually fails with exit code 139.
- Version: 2.13.1-1
- Operating System and version: CentOS Linux release 7.7.1908 (Core). Icinga2 runs in a Docker container.
- Enabled features: api checker gelf ido-mysql mainlog
- Icinga Web 2 version: 2.7.3
- Config validation:
[2021-10-25 07:46:35 +0000] information/cli: Icinga application loader (version: 2.13.1-1)
[2021-10-25 07:46:35 +0000] information/cli: Loading configuration file(s).
[2021-10-25 07:46:38 +0000] warning/config: Ignoring directory '/var/lib/icinga2/api/zones/***-Satellite' for unknown zone '***-Satellite'.
[2021-10-25 07:46:38 +0000] warning/config: Ignoring directory '/var/lib/icinga2/api/zones/paas-***-training' for unknown zone 'paas-***-training'.
[2021-10-25 07:46:38 +0000] information/ConfigItem: Committing config item(s).
[2021-10-25 07:46:38 +0000] information/ApiListener: My API identity: icinga2
[2021-10-25 07:46:39 +0000] warning/ApplyRule: Apply rule 'mail-icingaadmin' (in /etc/icinga2/conf.d/notifications.conf: 11:1-11:45) for type 'Notification' does not match anywhere!
[2021-10-25 07:46:39 +0000] warning/ApplyRule: Apply rule 'mail-icingaadmin' (in /etc/icinga2/conf.d/notifications.conf: 23:1-23:48) for type 'Notification' does not match anywhere!
[2021-10-25 07:46:39 +0000] warning/ApplyRule: Apply rule 'backup-downtime' (in /etc/icinga2/conf.d/downtimes.conf: 5:1-5:52) for type 'ScheduledDowntime' does not match anywhere!
[2021-10-25 07:46:39 +0000] information/ConfigItem: Instantiated 1 GelfWriter.
[2021-10-25 07:46:39 +0000] information/ConfigItem: Instantiated 1 IdoMysqlConnection.
[2021-10-25 07:46:39 +0000] information/ConfigItem: Instantiated 1 CheckerComponent.
[2021-10-25 07:46:39 +0000] information/ConfigItem: Instantiated 1 User.
[2021-10-25 07:46:39 +0000] information/ConfigItem: Instantiated 1 UserGroup.
[2021-10-25 07:46:39 +0000] information/ConfigItem: Instantiated 3 ServiceGroups.
[2021-10-25 07:46:39 +0000] information/ConfigItem: Instantiated 3 TimePeriods.
[2021-10-25 07:46:39 +0000] information/ConfigItem: Instantiated 46 Zones.
[2021-10-25 07:46:39 +0000] information/ConfigItem: Instantiated 2 NotificationCommands.
[2021-10-25 07:46:39 +0000] information/ConfigItem: Instantiated 2 HostGroups.
[2021-10-25 07:46:39 +0000] information/ConfigItem: Instantiated 1 IcingaApplication.
[2021-10-25 07:46:39 +0000] information/ConfigItem: Instantiated 55 Endpoints.
[2021-10-25 07:46:39 +0000] information/ConfigItem: Instantiated 1 FileLogger.
[2021-10-25 07:46:39 +0000] information/ConfigItem: Instantiated 1 ApiUser.
[2021-10-25 07:46:39 +0000] information/ConfigItem: Instantiated 282 CheckCommands.
[2021-10-25 07:46:39 +0000] information/ConfigItem: Instantiated 1 ApiListener.
[2021-10-25 07:46:39 +0000] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2021-10-25 07:46:39 +0000] information/cli: Finished validating the configuration file(s).
The node is a master node. There are 45 zones and 55 endpoints in the cluster. No zone has more than 2 endpoints. Icinga and IDO checks report OK. The Icinga2 process doesn’t use much CPU.
There are ~120k checks deployed to the cluster; some are active and some are passive. Most of the checks run on satellites.
Icinga2 doesn’t log any errors before the failure.
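For context on the failure itself: by the usual shell and Docker convention, an exit status above 128 means the process was terminated by a signal, and the signal number is the status minus 128. A minimal Python sketch that decodes the 139 mentioned above (the helper name `describe_exit_status` is made up for illustration):

```python
import signal


def describe_exit_status(status: int) -> str:
    """Map a shell-style exit status to a signal name where applicable."""
    if status > 128:
        # Statuses above 128 conventionally encode "killed by signal N".
        try:
            return signal.Signals(status - 128).name
        except ValueError:
            return f"unknown signal {status - 128}"
    return f"normal exit with code {status}"


print(describe_exit_status(139))  # prints "SIGSEGV"
```

So the process appears to die from a segmentation fault. A kill by the kernel OOM killer would typically show up as status 137 (SIGKILL) instead.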
I believe that 60 GB is too much even for 120k checks. Any ideas what could cause such high memory consumption and what we can do to fix it?
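To put the numbers in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes that "60 GB" means 60 GiB and that all of the memory is attributable to check objects; both are simplifications.

```python
# Figures from the report above; the GiB interpretation is an assumption.
memory_bytes = 60 * 1024**3
checks = 120_000

# Average memory per check if everything were attributed to checks.
per_check_kib = memory_bytes / checks / 1024
print(f"{per_check_kib:.0f} KiB per check")  # prints "524 KiB per check"
```

That works out to roughly half a megabyte per check on the master, even though most of the checks are executed on satellites.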
