Hi,
I’m having some issues with my Icinga setup.
We have one master and 3 satellites. The master only receives and displays the check results (the only thing it checks is itself).
Almost every time a deployment occurs, icingadb seems to fail for 5 to 10 minutes. During that time, the IO delay on the host server increases quite a bit (the master and satellites are virtual machines under Proxmox).
If I look into the logs, I can see this:
Logs from mariadb
2024-11-14 9:13:27 426426 [Warning] Aborted connection 426426 to db: ‘icingadb’ user: ‘icingadb’ host: ‘localhost’ (Got an error reading communication packets)
2024-11-14 9:13:27 426440 [Warning] Aborted connection 426440 to db: ‘icingadb’ user: ‘icingadb’ host: ‘localhost’ (Got an error reading communication packets)
2024-11-14 9:13:27 426444 [Warning] Aborted connection 426444 to db: ‘icingadb’ user: ‘icingadb’ host: ‘localhost’ (Got an error reading communication packets)
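To see whether these warnings really cluster around deployments, they could be counted per minute. This is only a minimal sketch, assuming the MariaDB log line format shown above; the function name `aborted_per_minute` is made up for illustration and is not part of any Icinga or MariaDB tooling.

```python
import re
from collections import Counter

# Captures the timestamp down to the minute for each "Aborted connection"
# warning. Assumes the log line layout shown above:
#   <date> <h:mm:ss> <thread id> [Warning] Aborted connection ...
ABORTED = re.compile(
    r"^(\d{4}-\d{2}-\d{2}\s+\d{1,2}:\d{2}):\d{2}\s+\d+\s+\[Warning\] Aborted connection"
)

def aborted_per_minute(lines):
    """Return a Counter mapping 'YYYY-MM-DD H:MM' to the number of aborted connections."""
    counts = Counter()
    for line in lines:
        match = ABORTED.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

It accepts any iterable of lines, so an open log file can be passed directly; the minutes with the highest counts can then be compared with the deployment times.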
Logs from icingadb
2024-11-14T09:12:57.771306+01:00 HOST icingadb[2497003]: heartbeat: Previous heartbeat not read from channel
2024-11-14T09:13:00.771534+01:00 HOST icingadb[2497003]: heartbeat: Previous heartbeat not read from channel
2024-11-14T09:13:03.771369+01:00 HOST icingadb[2497003]: heartbeat: Previous heartbeat not read from channel
- Version used: 2.14.3-1
- Operating System and version: Debian 11
- Enabled features (icinga2 feature list): api checker icingadb influxdb2 mainlog notification
- Icinga Web 2 version and modules (System - About): customdashboards, director, doc, grafana, icingadb, incubator, itop
- Config validation (icinga2 daemon -C):
[2024-11-14 10:27:17 +0100] information/cli: Icinga application loader (version: r2.14.3-1)
[2024-11-14 10:27:17 +0100] information/cli: Loading configuration file(s).
[2024-11-14 10:27:19 +0100] information/ConfigItem: Committing config item(s).
[2024-11-14 10:27:19 +0100] information/ApiListener: My API identity: icinga-master-p-01.srv-gsi.brgm.recia.net
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 5 NotificationCommands.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 1 IcingaApplication.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 198 HostGroups.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 13708 Hosts.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 4 Downtimes.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 1 Influxdb2Writer.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 13964 Dependencies.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 34 Comments.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 1 IcingaDB.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 1 FileLogger.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 6 Zones.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 1 CheckerComponent.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 1 User.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 4 Endpoints.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 3 ApiUsers.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 1 ApiListener.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 1 NotificationComponent.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 256 CheckCommands.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 1 UserGroup.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 2 ServiceGroups.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 2 TimePeriods.
[2024-11-14 10:27:27 +0100] information/ConfigItem: Instantiated 37002 Services.
[2024-11-14 10:27:27 +0100] information/ScriptGlobal: Dumping variables to file ‘/var/cache/icinga2/icinga2.vars’
[2024-11-14 10:27:28 +0100] information/cli: Finished validating the configuration file(s).
The master has 16 CPUs and 16 GiB of RAM and doesn’t seem to struggle too much, but the mariadb service does take quite a bit of resources, with many subprocesses; I wonder if that’s normal.
Here’s what I configured on the mariadb server:
[mysqld]
innodb_buffer_pool_size = 7000M
max_connections = 1000
innodb_log_buffer_size = 64M
key_buffer_size = 256M
Does anybody have an idea about this?