Icinga2 alert metrics information, help needed

Hello,

I hope I am posting this to the right place. Please feel free to redirect me.

I am looking for a way to collect data such as: the services that alerted the most during a set time period, the hosts that alerted the most during a set time period, the things that flapped the most, and so on.

Does anyone know of a module, or possibly a script that reads the Icinga database, that could provide this information?

Hi,

I think most people use Grafana: https://grafana.com/
You can choose between two data sources: InfluxDB and Graphite. Icinga 2 offers features that write the metrics to either of them. Look here: Features - Icinga 2

There is also a module to integrate the Grafana graphs into icingaweb2: GitHub - Mikesch-mp/icingaweb2-module-grafana: Grafana module for Icinga Web 2 (supports InfluxDB & Graphite)

Hello,

There is a Graphite module for Icinga that draws graphs from the metrics that each host or check returns. But as @stevie-sy said, most people use Grafana; it depends on what suits your situation best.

Grafana/InfluxDB is something we already run in our infrastructure, and it’s useful for tracking system resources. However, I’m not seeing a way to use it to get metrics about the alerts themselves: the meta metrics about Icinga itself, such as which specific alerts appear most frequently.
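
To make the kind of report I mean concrete, here is a minimal Python sketch of the counting. It assumes the state changes have already been exported somehow into plain (timestamp, host, service, state) tuples. That record layout, the function name top_alerting, and the sample data are all hypothetical; this is not Icinga's database schema or any existing module.

```python
from collections import Counter
from datetime import datetime


def top_alerting(records, start, end, key="service", n=5):
    """Count non-OK state changes inside [start, end).

    records: iterable of (timestamp, host, service, state) tuples.
             Hypothetical layout, NOT Icinga's actual schema.
    key:     "host" to group per host, anything else to group
             per (host, service) pair.
    """
    counts = Counter()
    for ts, host, service, state in records:
        if not (start <= ts < end):
            continue  # outside the reporting window
        if state == "OK":
            continue  # recoveries are not counted as alerts
        counts[host if key == "host" else (host, service)] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        (datetime(2024, 1, 1, 10), "web1", "http", "CRITICAL"),
        (datetime(2024, 1, 1, 11), "web1", "http", "OK"),
        (datetime(2024, 1, 1, 12), "web1", "http", "CRITICAL"),
        (datetime(2024, 1, 1, 13), "db1", "disk", "WARNING"),
    ]
    window = (datetime(2024, 1, 1), datetime(2024, 1, 2))
    print(top_alerting(sample, *window))
    print(top_alerting(sample, *window, key="host"))
```

Counting every non-OK transition is only one possible definition of "alerted"; a real report might count only hard states or only sent notifications, and flapping would need its own counter.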