Your manager comes to you: “I need a monthly or weekly report showing which servers have had issues in the last week or month. I need to see which hosts have had critical disk alerts and for how long. I need CPU stats and which ones were critical in the last month. Oh, and those custom HTTP checks: I need to know how long they were critical and for which hosts. On my desk once a week or month, please?”
So you open the reporting module and realise you don’t have what you need.
@Radius540R What exactly are you missing from the reporting module?
I know it is sparse, but what specifically is missing?
I do mostly hear random complaints about it being insufficient, but not many where people could tell me “I need this kind of data, structured in way x and presented in way y”.
Lorenz, which other products on the market would you compare Icinga to?
You could say that nothing is missing, or you could say that everything is missing, depending on what other products offer. Examples: data trends, KPIs, advanced graphs, machine learning.
Could you please keep it on topic[1]? Lorenz kindly asked what kind of SLA or reporting features you would like to see and you started to ramble about trend predictions, up to ML.
I am quite certain that the current reporting module does not satisfy every need outlined in your anecdotal first post. For example, having a report about the last month’s outages would be useful in lots of compliance scenarios.
But without hearing specific demands, it’s hard to help satisfy them. Thus, please try to keep your communication direct.
[1] Which you have set by creating this thread, btw.
No worries, but thanks for the further clarification.
Having some kind of “usual suspects” list as well as an outliers list might indeed come in handy. Going further, a check’s performance data may even allow a trend prediction using simple statistics, e.g., linear growth for a steadily filling disk.
At least the top ten troublemakers should be quite easy to implement with the data already available in Icinga DB. The same goes for newly problematic hosts, i.e. those with no or few recorded state changes in the past.
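To make the idea a bit more concrete, here is a minimal sketch of both lists. It assumes the state changes have already been exported as plain `(host, timestamp, new_state)` tuples; that shape, the state names, and the function names are made up for illustration and are not the Icinga DB schema.

```python
from collections import Counter

# Hypothetical record shape: (host, unix_timestamp, new_state).
# This is an illustrative export format, NOT the Icinga DB schema.
PROBLEM_STATES = {"WARNING", "CRITICAL", "UNKNOWN", "DOWN"}


def top_troublemakers(state_changes, limit=10):
    """Return (host, count) pairs for the hosts that changed into a
    problem state most often, most frequent first."""
    counts = Counter(
        host for host, _ts, state in state_changes if state in PROBLEM_STATES
    )
    return counts.most_common(limit)


def new_problem_hosts(state_changes, since, max_earlier=0):
    """Return hosts that entered a problem state at or after `since`
    but have at most `max_earlier` state changes recorded before it."""
    earlier = Counter(host for host, ts, _state in state_changes if ts < since)
    recent = {
        host
        for host, ts, state in state_changes
        if ts >= since and state in PROBLEM_STATES
    }
    return sorted(host for host in recent if earlier[host] <= max_earlier)


sample = [
    ("web01", 100, "CRITICAL"),
    ("web01", 160, "OK"),
    ("web01", 300, "CRITICAL"),
    ("db01", 120, "WARNING"),
    ("db01", 180, "OK"),
    ("mail01", 900, "CRITICAL"),
]

print(top_troublemakers(sample))
print(new_problem_hosts(sample, since=500))
```

In the sample data, `web01` leads the list with two problem transitions, and `mail01` is the only host that shows up as newly problematic after timestamp 500. A real report would need the same logic per service as well, which is left out here.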
I will try to pitch this somehow to the web team, as on the core or daemon side, everything should already be there.
For the prediction part, something supporting the perf data would be required.
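As a rough illustration of the “simple statistics” I mean, a least-squares line through a few disk usage samples is already enough to estimate when a steadily filling disk runs full. The sketch below assumes the perf data is available as `(time, used)` pairs in whatever units you like; the function names and the sample values are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all samples share the same timestamp")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def time_until_full(samples, capacity):
    """Estimate the time from the last sample until `capacity` is reached.

    `samples` is a list of (time, used) pairs, oldest first.
    Returns None if usage is flat or shrinking.
    """
    xs = [t for t, _used in samples]
    ys = [used for _t, used in samples]
    slope, intercept = linear_fit(xs, ys)
    if slope <= 0:
        return None
    t_full = (capacity - intercept) / slope
    return max(0.0, t_full - xs[-1])


# Made-up example: time in days, usage in GB, on a 100 GB disk.
samples = [(0, 50), (1, 52), (2, 54), (3, 56)]
print(time_until_full(samples, capacity=100))
```

With these made-up samples the disk grows by 2 GB per day, so the estimate is 22 days after the last sample. Real disk usage is rarely this well behaved, so the result should be read as a hint, not a forecast.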
However, there was one entry in your example list I don’t quite get: “reporting list of hosts contacting Icinga but are not IN Icingaweb”. Are you referring to pending certificate requests on the Icinga 2 master node, or to signed clients that are missing a corresponding Host object?
I would also like the option to query the Perfdata writer (InfluxDB) for tactical views. The donuts are nice, but I need to switch to my Grafana Icinga dashboard to know whether the number of unknowns is sinking or rising.