I hope I am posting this to the right place. Please feel free to redirect me.
I am looking for a way to collect data on things such as: services that alerted the most during a set time period, hosts that alerted the most during a set time period, things that flapped the most, etc.
Does anyone know of a module, or possibly a script that reads the Icinga database, that could provide this information?
I think most people are using Grafana: https://grafana.com/
You can choose between two data sources: InfluxDB and Graphite. Icinga offers features to write the metrics into either of them. Look here: Features - Icinga 2
There is a Graphite module for Icinga that makes graphs from the metrics that each host or check returns. But as @stevie-sy said, most people use Grafana. It depends on what suits your situation best.
Grafana/InfluxDB is something we already run in our infrastructure, and it’s useful for tracking system resources. However, I’m not seeing a way to use it to get metrics about the alerts themselves: the meta metrics about Icinga itself, such as which specific alerts appear frequently.
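To make the kind of report I mean concrete, here is a rough Python sketch. It assumes the state-change or alert events have already been exported from Icinga's history into `(timestamp, host, service)` tuples. That export step, the tuple layout, the sample data, and the `top_alerting` helper are all my own assumptions for illustration, not something Icinga provides.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample of exported alert events: (timestamp, host, service).
# In practice these would come from Icinga's history data; how they are
# exported is not covered here.
EVENTS = [
    (datetime(2024, 1, 1, 8, 0), "web01", "http"),
    (datetime(2024, 1, 1, 9, 0), "web01", "http"),
    (datetime(2024, 1, 1, 10, 0), "web01", "disk"),
    (datetime(2024, 1, 2, 8, 0), "db01", "mysql"),
    (datetime(2024, 1, 5, 8, 0), "db01", "mysql"),
]


def top_alerting(events, start, end, key, n=5):
    """Return the n most frequent groups among events with start <= timestamp < end.

    `key` maps one event tuple to the value to group by, e.g. the host name
    or the (host, service) pair.
    """
    counts = Counter(key(e) for e in events if start <= e[0] < end)
    return counts.most_common(n)


if __name__ == "__main__":
    start = datetime(2024, 1, 1)
    end = datetime(2024, 1, 3)
    # Hosts that alerted the most in the window.
    print(top_alerting(EVENTS, start, end, key=lambda e: e[1]))
    # Services (per host) that alerted the most in the window.
    print(top_alerting(EVENTS, start, end, key=lambda e: (e[1], e[2])))
```

The same counting could give a rough "flapped the most" ranking if the input holds one tuple per state change instead of one per alert.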