Massive writes to Icinga database

Hi,
this might be a MariaDB issue, but I am hoping to get some troubleshooting help here… I am using an external database (MariaDB 10.3 on Debian 10) with icinga2 r2.11.2 in a Docker container. My problem is that Icinga triggers massive writes to the database: 10–30 MB/s continuously. This will wear down the server’s SSD in just a few months…
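
One way to quantify this from the database side (just a sketch using standard InnoDB status counters, nothing Icinga-specific) is to sample the write counters twice and look at the difference:

-- Bytes written to data files and to the redo log since server start:
SHOW GLOBAL STATUS WHERE Variable_name IN ('Innodb_data_written', 'Innodb_os_log_written');
SELECT SLEEP(60);
-- Same counters a minute later; (difference / 60) gives bytes written per second:
SHOW GLOBAL STATUS WHERE Variable_name IN ('Innodb_data_written', 'Innodb_os_log_written');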

The only thing using this database server at the moment is Icinga, and when I pause the container, disk writes drop to normal levels.

I disabled ALL checks in Icinga (moved the config files away), but even with no (zero) hosts configured, Icinga still triggers disk writes of about 10 MB/s on the database server.

I enabled logging on the MariaDB server and the log grows as expected, by a few kB every ten seconds, nothing major. I see no suspicious SQL statements in the logs, but apparently updating the programstatus table makes the MariaDB server go bananas.
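
For what it’s worth, the heartbeat is easy to watch directly in the IDO database (a sketch; the table and column names are as I understand the default IDO schema with the icinga_ prefix):

-- Run inside the IDO database; status_update_time should move forward about every 10 seconds.
SELECT programstatus_id, status_update_time, program_start_time
FROM icinga_programstatus;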

This can’t possibly be normal, can it? Does anyone have an idea what the problem might be here, or how to troubleshoot this issue?

Hi,

I have a not-so-busy network and use Icinga to find problems quickly.

However, I see this heavy write traffic, and I tracked it down to Icinga. So I reduced the check frequency as much as practical; the intervals are now on the order of a few minutes. Even spread out in time, this could still cause a fair amount of traffic; it seems to be 100 kB/s … 1 MB/s.

But the single biggest source of traffic must be the update of the programstatus table every 10 seconds; I can watch it happen in phpMyAdmin. This is with enable_ha = false in IdoMysqlConnection.
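
For reference, the relevant object (typically /etc/icinga2/features-enabled/ido-mysql.conf) looks roughly like this in my setup; the host and credentials below are placeholders:

object IdoMysqlConnection "ido-mysql" {
  host = "db.example.lan"   // placeholder
  user = "icinga"
  password = "secret"       // placeholder
  database = "icinga_ido"
  enable_ha = false         // HA for the IDO feature is off
}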

I set up a separate MariaDB (MySQL) server with a special configuration, including:

binlog_format = STATEMENT
innodb-flush-neighbors = 2
innodb_idle_flush_pct = 2
innodb_flush_method = nosync
#innodb_flush_method = littlesync
innodb-doublewrite = OFF
innodb_flush_sync = OFF
innodb-io-capacity = 100
innodb-use-native-aio = OFF
innodb_flush_log_at_trx_commit = 0
innodb_flush_log_at_timeout = 2700
sync_binlog = 0
log_bin = OFF
binlog-ignore-db=icinga_ido
binlog-checksum=NONE

This looks like a lot of overkill, and none of it seems to reduce the traffic at all. Notably, all the “nosync”, sync = 0 and sync = OFF settings seem to have no effect.

MariaDB has a MEMORY engine that, per the documentation, cannot be used for the programstatus table (it has TEXT columns).
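
The offending columns can be listed like this (a sketch; icinga_ido is the schema name from my configuration above, and I am assuming the standard icinga_ table prefix):

SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'icinga_ido'
  AND table_name = 'icinga_programstatus'
  AND data_type IN ('text', 'mediumtext', 'longtext');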

I cannot find any way to reduce these “heartbeat” writes in Icinga. The only obvious way would be to turn off all database writing altogether (remove IdoMysqlConnection, e.g. via icinga2 feature disable ido-mysql), and with it Icinga Web.

So I believe (I haven’t read the code) that Icinga 2 simply does this: one or two database writes every 10 seconds, just to prove it is alive.

I guess it is no big deal on a busy server, and Icinga is made for these environments. But someone watching a few Smart Home devices will want to reduce this.

Is there anything that can be configured in Icinga, or in MySQL, to reduce this? Could a switch to PostgreSQL help (can PostgreSQL keep this away from the disk)?

To work around this for Icinga, my idea would be to try the MEMORY engine for this table. That would require changes related to the event_handler fields …
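
An untested sketch of what I mean (the column names are my guess from the IDO schema, so check with SHOW CREATE TABLE icinga_programstatus first; note that a MEMORY table also loses its contents on every server restart, which may well confuse Icinga):

-- Replace the TEXT columns with something the MEMORY engine accepts, then switch engines.
ALTER TABLE icinga_programstatus
  MODIFY global_host_event_handler VARCHAR(255),
  MODIFY global_service_event_handler VARCHAR(255);
ALTER TABLE icinga_programstatus ENGINE = MEMORY;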

Did anyone find a solution for the heavy write traffic? I’m running Icinga on a Raspberry Pi and it’s writing ~25 GB/day… that’s close to the capacity of my whole SD card and I’m concerned for its longevity.

Has anyone tried using a RAM disk to store the databases? That was my only other idea, to at least offload the writes from the SD card, but I am hoping to find another solution.
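
Something like this is what I have in mind, as a rough sketch only (it assumes the Debian default datadir /var/lib/mysql, the size is a guess, and anything in tmpfs is gone on every reboot, so the data would still have to be copied back to the SD card periodically and before shutdown):

# Run the MariaDB datadir from RAM; contents are lost on reboot!
sudo systemctl stop mariadb
sudo cp -a /var/lib/mysql /var/lib/mysql.disk     # keep an on-disk copy
sudo mount -t tmpfs -o size=512M tmpfs /var/lib/mysql
sudo cp -a /var/lib/mysql.disk/. /var/lib/mysql/  # cp -a as root preserves ownership
sudo systemctl start mariadb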

Sorry, I never found a solution and actually ended up scrapping my Icinga installation since it was killing my SSDs. I have thought of trying it again but haven’t gotten around to it.

I came here looking for a solution to this too - my Icinga container in Proxmox is writing at three times the sustained write rate of all other workloads combined(!), at around 20 IOPS / 250 kB per second (900 MB per hour, 21 GB a day!), which is going to kill my SSD far sooner than it would last without Icinga. And this is only monitoring a small handful of containers and a VM.

I may have to turn it off until I can find a workable solution :frowning:

In the end, I came up with my own solution (homelab quality, not at all production quality, involving tmpfs) - see the post at Heavy Disk Usage (I/O) with MariaDB - #14 by awnz