This is my first post here, so hopefully I've included all the necessary data to get an answer.
I have Icinga 2 (2.13.2-1) running on a RPi with Buster.
I have the following features enabled: api checker command ido-mysql influxdb mainlog notification
I have Icinga Web 2 (2.6.2) with only Director as an extra module (and its dependencies).
I have ~140 checks per 15 min, but idomysqlconnection_ido-mysql_queries_15mins is nearly 3,000.
Because I run this on an RPi and it's installed on an SD card, I'm concerned about the amount of I/O the database generates (~25 GB/day… nearly the capacity of the whole SD card). I haven't been able to figure out why there is such a large volume of writes to the DB, or why there are so many more queries than checks.
ibdata1 is at 72 MB and the two ib_logfile0/1 files are capped at 48 MB each, but iotop shows a huge amount of data constantly being written by mysql.
When I stop Icinga, mysql drops to a 'normal' I/O load of ~200-500 KB/min instead of ~5-10 MB/min.
I guess my question is: is this normal performance for Icinga? If so, how can I keep the queries in RAM for longer and write to disk less frequently? Icinga performs fine on the RPi and the load is minimal; it is just the ~25 GB a day of writes that has me concerned about the longevity of my SD card.
It may be helpful if you can provide the exact commands you are running and their direct output when you are checking these stats. It is usually best to remove interpretation from the equation.
Thanks for the tips… here are the commands and their outputs:
/var/lib/mysql $ ls -lh ib*
-rw-rw---- 1 mysql mysql 48M Jan 22 17:44 ib_logfile0
-rw-rw---- 1 mysql mysql 48M Jan 22 13:51 ib_logfile1
-rw-rw---- 1 mysql mysql 76M Jan 22 17:44 ibdata1
MariaDB [(none)]> show global variables like 'innodb_file_per_table';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| innodb_file_per_table | ON    |
+-----------------------+-------+
Yeah, I used general_log to log all the queries. Should I attach that file somewhere?
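For reference, I enabled it with something along these lines from the MariaDB prompt (the log file path here is just an example):

SET GLOBAL general_log_file = '/tmp/raspy.log';
SET GLOBAL general_log = 'ON';
-- and off again once the capture was done:
SET GLOBAL general_log = 'OFF';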
That would be great if you could provide the general log file for some period of time. I’m not sure how much sensitive data may be there, so you may want to review it first.
It may also help to narrow down which tables are being written to by their growth in size (if any). I have to assume it is not growing by 25 GB a day, or you would have run out of space by now.
Maybe this shows something interesting? (Adjust as necessary):
for i in 1 2 3 4 5; do date; ls -lS /var/lib/mysql/icinga | head -5; sleep 60; done
I've attached the log file from general_log. It ran for about 20 mins. I did my best to take out sensitive data. Raspy.log (792.8 KB)
The database isn't growing very much at all, which is what I would expect, maybe 1 MB a day… thankfully!
I ran the following as you suggested:
for i in 1 2 3 4 5; do date; ls -lS /var/lib/mysql/icinga | head -5; sleep 60; done
Sun Jan 23 15:07:52 PST 2022
total 33172
-rw-rw---- 1 mysql mysql 10485760 Jan 23 07:48 icinga_statehistory.ibd
-rw-rw---- 1 mysql mysql 491520 Jan 23 15:06 icinga_customvariables.ibd
-rw-rw---- 1 mysql mysql 475136 Jan 23 10:00 icinga_notifications.ibd
-rw-rw---- 1 mysql mysql 458752 Jan 23 15:06 icinga_customvariablestatus.ibd
Every repeat was exactly the same in both size and timestamp.
Using ls -lt, however, produced the following:
total 33172
-rw-rw---- 1 mysql mysql 114688 Jan 23 15:11 icinga_programstatus.ibd
-rw-rw---- 1 mysql mysql 114688 Jan 23 15:11 icinga_runtimevariables.ibd
-rw-rw---- 1 mysql mysql 442368 Jan 23 15:11 icinga_servicestatus.ibd
-rw-rw---- 1 mysql mysql 393216 Jan 23 15:09 icinga_hoststatus.ibd
Sun Jan 23 15:12:45 PST 2022
total 33172
-rw-rw---- 1 mysql mysql 114688 Jan 23 15:12 icinga_programstatus.ibd
-rw-rw---- 1 mysql mysql 114688 Jan 23 15:12 icinga_runtimevariables.ibd
-rw-rw---- 1 mysql mysql 442368 Jan 23 15:12 icinga_servicestatus.ibd
-rw-rw---- 1 mysql mysql 393216 Jan 23 15:09 icinga_hoststatus.ibd
Sun Jan 23 15:13:45 PST 2022
total 33172
-rw-rw---- 1 mysql mysql 114688 Jan 23 15:13 icinga_runtimevariables.ibd
-rw-rw---- 1 mysql mysql 114688 Jan 23 15:13 icinga_programstatus.ibd
-rw-rw---- 1 mysql mysql 442368 Jan 23 15:13 icinga_servicestatus.ibd
-rw-rw---- 1 mysql mysql 393216 Jan 23 15:09 icinga_hoststatus.ibd
Sun Jan 23 15:14:45 PST 2022
total 33172
-rw-rw---- 1 mysql mysql 114688 Jan 23 15:14 icinga_runtimevariables.ibd
-rw-rw---- 1 mysql mysql 114688 Jan 23 15:14 icinga_programstatus.ibd
-rw-rw---- 1 mysql mysql 393216 Jan 23 15:14 icinga_hoststatus.ibd
-rw-rw---- 1 mysql mysql 442368 Jan 23 15:14 icinga_servicestatus.ibd
Sun Jan 23 15:15:45 PST 2022
total 33172
-rw-rw---- 1 mysql mysql 114688 Jan 23 15:15 icinga_programstatus.ibd
-rw-rw---- 1 mysql mysql 114688 Jan 23 15:15 icinga_runtimevariables.ibd
-rw-rw---- 1 mysql mysql 442368 Jan 23 15:15 icinga_servicestatus.ibd
-rw-rw---- 1 mysql mysql 393216 Jan 23 15:14 icinga_hoststatus.ibd
Seems like these are also being rewritten constantly, although I didn’t see them while watching fatrace.
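For what it's worth, I was watching with something like this (run from the MySQL data directory so fatrace only reports events on that mount, write events only):

cd /var/lib/mysql
sudo fatrace -c -f W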
I have temporarily disabled the ido-mysql feature and that has removed the heavy I/O. Icinga is still notifying me when there are problems and influxdb is still collecting data. I'm still able to use Icinga Web 2 for some tasks (I haven't really checked the full scope of what I can and cannot do with ido-mysql disabled).
Would love to still use ido-mysql if I can get the writes down to a reasonable amount.
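For reference, toggling the feature is just the standard commands:

sudo icinga2 feature disable ido-mysql
sudo systemctl restart icinga2
# and to turn it back on later:
sudo icinga2 feature enable ido-mysql
sudo systemctl restart icinga2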
Thanks for providing that! The frequency of COMMIT is strange to me, and could be why the modified times keep getting updated and why the reported write volume is so large. I don't see that on my setup - maybe an AUTOCOMMIT difference?
I can say my programstatus and runtimevariables tables are not written to nearly as frequently as yours seem to be (hours apart). The service/host status tables are - but that's to be expected.
Are you using multiple masters? Can you check the contents of icinga_programstatus to see if it has more than one row?
Something odd may be going on with runtimevariables as well; it may be worth checking that table too.
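Something along these lines from the MariaDB prompt should show it (assuming your IDO database is named icinga, as the paths above suggest):

-- more than one row here would suggest multiple instances writing into the same IDO
SELECT COUNT(*) FROM icinga.icinga_programstatus;
SELECT COUNT(*) FROM icinga.icinga_runtimevariables;
-- and, given the COMMIT flood, worth checking:
SHOW GLOBAL VARIABLES LIKE 'autocommit';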
Is it normal to have ~20 queries per service check? Icinga reports 133 active service checks per 15 min and ~2,600 ido-mysql queries per 15 min. These numbers come from the check_icinga perf data.
It's a strange one… I've looked at ways to simply move the database into RAM using buffers or other MariaDB options, but nothing seems to actually reduce the writes reported by both iotop and iostat.
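For context, these are the kinds of InnoDB knobs I've been trying in /etc/mysql/mariadb.conf.d/ (the values are only examples of what I tested, not a recommendation):

[mysqld]
# flush the redo log roughly once per second instead of on every COMMIT
innodb_flush_log_at_trx_commit = 2
# keep more of the working set in RAM
innodb_buffer_pool_size = 256M
# let more dirty pages sit in the buffer pool before background flushing
innodb_max_dirty_pages_pct = 90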
No problem! I wish I had a simple answer, and thanks for providing this info…maybe someone else will use it to think of something I haven’t.
I have a much lower queries-to-checks ratio than you are seeing. I will say I'm not using influxdb… maybe you could disable that and re-enable ido-mysql briefly?
Yeah, I'm not sure what is going on, but I feel it may be an issue with MariaDB. After doing a lot of reading, it seems others are having similar issues with large amounts of disk writes.
I disabled the ido-mysql feature, but still had icinga-director running as a service and found it was producing the same large writes. With both icinga-director and ido-mysql disabled, the heavy write issue is gone. In comparison, influxdb has written about 50 MB where ido-mysql would be up to about 10 GB.
I did disable influxdb and run ido-mysql on its own, but that didn't change anything.
How about using Icinga DB? I've heard mixed things about it. Does it work with Icinga Web 2, and how would I go about that?
It's certainly not an issue with MariaDB itself, as it will not run queries on its own against the Icinga 2 database. I have not seen this behaviour myself, so I am trying to learn more about it in order to troubleshoot it. The 10 GB write measurement seems off to me, so I'm not sure it's useful. I haven't used Icinga DB.
I just wonder if MariaDB is rewriting major parts of the table file when a row is deleted or updated? Watching iotop, there are moments where the accumulated write field jumps by 4-5 MB.
The 10 GB figure comes from what iotop -aoP reports for mysql over a couple of hours of monitoring.
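Concretely, something along these lines, left running and then reading the accumulated DISK WRITE column for mysqld:

sudo iotop -aoP   # -a accumulated totals, -o only processes doing I/O, -P processes rather than threads
iostat -m 60      # per-device MB read/written, sampled every 60 seconds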
My temporary solution… move MariaDB to tmpfs and rsync it back to disk once a day or so.
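Roughly what that looks like (a sketch only; the paths, tmpfs size, and config file location are just my setup):

# one-time move of the datadir into RAM
sudo systemctl stop mariadb
sudo mkdir -p /mnt/mysql-ram
sudo mount -t tmpfs -o size=512M tmpfs /mnt/mysql-ram
sudo rsync -a /var/lib/mysql/ /mnt/mysql-ram/
# point datadir at /mnt/mysql-ram in /etc/mysql/mariadb.conf.d/50-server.cnf, then:
sudo systemctl start mariadb

# once a day (e.g. from cron), copy the in-RAM datadir back to the SD card;
# note that copying a running database this way can give an inconsistent snapshot
sudo rsync -a --delete /mnt/mysql-ram/ /var/lib/mysql/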
Works in an LXC container on Proxmox.
Obviously this method carries risks and is not suitable for a production environment unless you really know what you are doing and have daily backups, in case the rsynced database "save" is inconsistent or corrupt.