I just wanted to share a small script I developed today.
As you can surely imagine, monitored systems come and go, and those that leave usually don’t return.
But the data of these systems is still being kept in various places, despite not being useful to anyone.
As we have quite a high fluctuation among our monitored systems, I looked for a way to clean up at least some of the data left behind by these no-longer-present systems. That is why I wrote this script to clean up orphaned hosts in my InfluxDB.
I was able to reclaim several GB using that script in our environment (~2000 hosts, ~10000 services).
This script does the following:
- collect list of hostnames present in Icinga2
- collect the measurements in Icinga2’s InfluxDB
- collect the hostnames within a measurement
- check whether each hostname is still present in Icinga2; if not, drop ALL series for that host
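The steps above can be sketched roughly as follows. This is a simplified illustration, not the actual script: the Icinga2 API URL, the credentials placeholder, the `icinga2` database name, and the `hostname` tag key are all assumptions and may differ in your setup.

```python
import json
import subprocess
import urllib.request

# Assumptions for illustration: adjust URL, credentials, database and tag key.
ICINGA_URL = "https://localhost:5665/v1/objects/hosts"
INFLUX_DB = "icinga2"


def influx_query(query):
    """Run an InfluxQL query through the influx CLI and return output lines."""
    result = subprocess.run(
        ["influx", "-database", INFLUX_DB, "-format", "csv", "-execute", query],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def icinga_hostnames(auth_header):
    """Collect all host object names known to Icinga2 via its REST API."""
    req = urllib.request.Request(ICINGA_URL, headers={"Authorization": auth_header})
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return {obj["name"] for obj in data["results"]}


def find_orphans(influx_hosts, icinga_hosts):
    """Hostnames that still have series in InfluxDB but are gone from Icinga2."""
    return sorted(set(influx_hosts) - set(icinga_hosts))


def cleanup(auth_header):
    """Tie the steps together; run only against a backed-up database."""
    influx_hosts = set()
    # Hostnames stored in InfluxDB (simplified CSV parsing, no error handling).
    for line in influx_query('SHOW TAG VALUES WITH KEY = "hostname"')[1:]:
        influx_hosts.add(line.rsplit(",", 1)[-1])
    for host in find_orphans(influx_hosts, icinga_hostnames(auth_header)):
        # Drop ALL series of a host that no longer exists in Icinga2.
        influx_query("DROP SERIES WHERE \"hostname\" = '%s'" % host)
```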
Obviously run this at your own risk and create backups if your InfluxDB contains critical data!
Also, please note that this script causes a lot of I/O, depending on how many series have to be dropped.
Please see the following GitHub repo for this script:
I might implement the following additional functionality:
- cleanup orphaned services where the hosts still exist in Icinga2
- provide some sort of retention mechanism that drops only series whose last datapoint is more than X days in the past
- switch to InfluxDB-Python instead of calling the influx binary to improve speed
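For the retention idea, the core check could look something like the sketch below. The query shown in the docstring is one possible way to get the newest datapoint of a series in InfluxQL; the measurement name and tag key are assumptions.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_point, max_age_days):
    """True if a series' newest datapoint is older than max_age_days.

    last_point would come from a query such as
        SELECT last("value") FROM "measurement" WHERE "hostname" = '<host>'
    whose result row carries the timestamp of the newest datapoint.
    """
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
    return last_point < cutoff
```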
Maybe this script is useful for someone else, too.