Icinga2 Features

bagz · August 4, 2020, 10:42am

Hello everyone,

Can someone please help me tick off, in one place, which of these features apply to Icinga2? The website and documentation have this scattered everywhere and I am not sure where to find all the answers. I understand that this list is pretty broad and, as I said, so far I have not found software that covers all of it; I just want to find out what is supported by Icinga. So here we go:

Monitoring via SNMP supported vendors such as Juniper Arista Mikrotik PaloAlto Cisco Aruba and others
Additional features for these vendors via SNMP or other means: BGP states, ISIS states, Cisco IPSLA, Cisco CBQOS, Mikrotik queues and Juniper QoS.
Notification integrations such as Mattermost, Slack, WhatsApp, email, Signal, or others?
Reporting HTML CSV PDF formats.
Reporting automated reports for SLA data, QoS reports, link utilization and others. Work hours reporting set per group of devices.
Threshold reporting and sending reports via email and/or some other channel
Hosts aliases and showing aliases in the reports if needed.
Granular permissions to the system. A client can see his/her list of devices and click through them but it can also see parts of core equipment like interface graphs, cpu graphs, specific application graphs. Then access to its maps and dashboards.
The above brings me to Maps support. Drawing a map out of LLDP data? Showing the status of hosts on the map and live link throughput? Geolocation maps? Maybe an ability to see an outage on the map as a group of devices.
Events and event correlation. Manage events and see the history of what happened to a specific event, for example through acknowledgements.
Classification of event priority.
Ability to alert on specific events and escalate further to a separate group after specific time has passed.
Host grouping. Hosts can belong to multiple groups.
Some form of basic inventory data that can be monitored. Like OS version OS types.
Ability to search and export that search in Excel format. What I mean by this is an ability to show a list of all hosts of type Cisco, state up/down/unknown, version of software, location filter.
Integration with some ticketing system like Zammad, Jira, OTRS or any other currently supported.
Ability to send alerts in a consistent format into the ticketing system so that tickets can be auto-closed.
TopN bottomN type of dashboard and/or reporting.
Ability to monitor via DNS names or just straight IPs, and to give hosts any name needed.
Health information for devices, Power supply, temp, routing engine, fans, dBM on optics and other such stats.
Traffic stats per interface and overall per device.
REST api to add or remove hosts
Distributed Master/Slave setup with distributed pollers.
Authentication Local and LDAP integration
Suppress alerts based on a parent child relationship.
Some form of integration with external CMDB via API.
Anomaly detection and Trend predictions.
Execute external scripts when events happen. This is user defined obviously
Device discovery and device rediscovery when needed by user. Ideally a button i click to see a new graph that was configured.
Flap protection (this is a pretty big one to try and eliminate false positives as much as possible). So many monitoring systems claim to have this but it does not work.
Bulk edit of hosts. For example, I want to change a template for all marked hosts, or a new SNMP string, or groups, or enable/disable. You get the idea: things that are in common on multiple devices.
Interface stats: average, min, max and total. It would be nice if you could show a legend below and have the ability to hide the legend.
Ability for custom development that can be paid for if needed.
Ability to monitor virtual chassis and alert on a faulty host that belongs to a virtual chassis.
Control what is being polled on devices, for example bulk polling once a day to save resources. Then do not poll any interfaces without descriptions, or do not poll any interfaces for a particular type of router because they are management interfaces, and other such rules.
Manage initial installation and configuration with Puppet or Ansible.

Someone · August 4, 2020, 1:10pm

Hello

Monitoring via SNMP supported vendors such as Juniper Arista Mikrotik PaloAlto Cisco Aruba and others

Additional features for these vendors via SNMP or other means: BGP states, ISIS states, Cisco IPSLA, Cisco CBQOS, Mikrotik queues and Juniper QoS.

Ability to monitor virtual chassis and alert on a faulty host that belongs to a virtual chassis.

Interface stats: average, min, max and total. It would be nice if you could show a legend below and have the ability to hide the legend.

Traffic stats per interface and overall per device.

Ability to monitor via DNS names or just straight IPs, and to give hosts any name needed.

Health information for devices, Power supply, temp, routing engine, fans, dBM on optics and other such stats.

Maybe. For those questions: Icinga started out as a Nagios fork and still implements the standard Nagios input/output interface for checks, so plugins from the Nagios/Icinga Exchange that implement what you need should work here. I can't say for each of those vendors whether what you need exists; you will need to search for yourself.
But most of those are standard KPIs in monitoring, and so are covered by long-established plugins.
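To make the "standard input/output system for checks" concrete, here is a minimal sketch of what such a plugin does: it prints one status line (human-readable text, a pipe, then performance data) and reports its state through the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The `temperature` label, the thresholds and the helper names are assumptions made up for illustration; a real plugin would read the value from the device, for example via SNMP.

```python
# Exit codes defined by the Nagios/Icinga plugin interface.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a measured value to a plugin exit code (higher value is worse)."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def format_output(label, value, unit, warn, crit):
    """Return (exit_code, status_line); the line is 'text | perfdata'."""
    state = evaluate(value, warn, crit)
    text = f"{STATE_NAMES[state]} - {label} is {value}{unit}"
    # Perfdata format: label=value[unit];warn;crit
    perfdata = f"{label}={value}{unit};{warn};{crit}"
    return state, f"{text} | {perfdata}"


def main():
    # Hypothetical reading; a real plugin would query the device here.
    temperature = 41.5
    state, line = format_output("temperature", temperature, "", warn=60, crit=75)
    print(line)
    return state
```

A real plugin script would finish with `sys.exit(main())` so that the monitoring core receives the state as the process exit code.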

Notification integrations such as Mattermost, Slack, WhatsApp, email, Signal, or others?

Yes, the notification system can do many things. Mail, Mattermost and Slack are supported; I'm not sure about the others. You should take a look at Icinga Exchange:
https://exchange.icinga.com/search?q=notification

Reporting HTML CSV PDF formats.

Yes. You can natively export dashboards to PDF/CSV/JSON in icingaweb2; I guess saving the page directly as HTML should work too.

Reporting automated reports for SLA data, QoS reports, link utilization and others. Work hours reporting set per group of devices.

No, not to my knowledge; maybe through plugins.

Threshold reporting and sending reports via email and/or some other channel

No, not to my knowledge; maybe through plugins.

Hosts aliases and showing aliases in the reports if needed.

You can configure display names for hosts that differ from their configuration names; I'm not sure whether that's what you mean by aliases.

Granular permissions to the system. A client can see his/her list of devices and click through them but it can also see parts of core equipment like interface graphs, cpu graphs, specific application graphs. Then access to its maps and dashboards.

Yes, you can have granular permissions for icingaweb based on various authentication sources (LDAP, etc.).

The above brings me to Maps support. Drawing a map out of LLDP data? Showing the status of hosts on the map and live link throughput? Geolocation maps? Maybe an ability to see an outage on the map as a group of devices.

Icinga has Nagios roots, so NagVis is supported; you can also use a custom module like the OpenStreetMap one.

Events and event correlation. Manage events and see the history of what happened to a specific event, for example through acknowledgements.

You can get the state/acknowledgement history by clicking on a host or service in icingaweb, but if you need something more event-based that you can query in specific ways, you could export it to ELK/Splunk/Logstash/Elasticsearch by hand or using existing plugins.

Classification of event priority.

I'm not sure I understand; please elaborate.

Ability to alert on specific events and escalate further to a separate group after specific time has passed.

I'm not sure I fully understand. I would say this kind of behaviour depends more on your ticketing system than on the monitoring engine itself, but through the notification system and custom scripting you could push your escalation wherever it's needed …

Host grouping. Hosts can belong to multiple groups.

Yes

Some form of basic inventory data that can be monitored. Like OS version OS types

No. Natively, Icinga is not meant to actively collect data to be used as inventory, but you can tag your hosts/services with custom variables so that it's easier for you to assign checks or define conditions.

Ability to search and export that search in Excel format. What I mean by this is an ability to show a list of all hosts of type Cisco, state up/down/unknown, version of software, location filter.

Yes, you can export dashboards or searches from icingaweb as CSV and rework them in Excel if needed.

Integration with some ticketing system like zammad jira otrs or any other currently supported.

Yes, Jira integration is supported; I'm not sure about the others.

Ability to send alerts in a consistent format into the ticketing system so that tickets can be auto-closed.

I'm not sure I understand; however, you could connect your ticketing system to Icinga through the API to acknowledge checks if needed.
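As a rough sketch of what "acknowledge checks through the API" could look like from the ticketing side, the snippet below builds (but does not send) a request to the Icinga 2 `acknowledge-problem` action. The base URL, host, service, author and comment are placeholders invented for illustration; authentication (HTTP basic auth for an API user) and TLS verification are left out and would be needed in a real call.

```python
import json
import urllib.request

# Assumption: hypothetical API endpoint; 5665 is the default Icinga 2 API port.
API_BASE = "https://icinga.example.com:5665/v1"


def build_ack_request(host, service, author, comment):
    """Build a POST request that acknowledges one service problem."""
    payload = {
        "type": "Service",
        # filter_vars keeps user-supplied names out of the filter expression.
        "filter": "host.name == host_name && service.name == service_name",
        "filter_vars": {"host_name": host, "service_name": service},
        "author": author,
        "comment": comment,
        "notify": True,
    }
    return urllib.request.Request(
        f"{API_BASE}/actions/acknowledge-problem",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Accept": "application/json", "Content-Type": "application/json"},
    )
```

A ticketing hook could call `build_ack_request(...)` when a ticket is opened and pass the result to `urllib.request.urlopen` with suitable credentials and certificate settings.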

TopN bottomN type of dashboard and/or reporting.

Not sure what you mean here, but icingaweb supports dashboards, and more visualisations can be added depending on your needs.
https://exchange.icinga.com/search?q=dashboard

REST api to add or remove hosts

Yes, the REST API is definitely one of Icinga's strongest points.
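For a feel of how adding and removing hosts over the REST API works, here is a minimal sketch that only builds the two requests: a `PUT` on `/v1/objects/hosts/<name>` creates a host, and a `DELETE` on the same URL removes it. The base URL, the host name, the address and the `generic-host` template are assumptions for illustration; credentials and TLS settings are omitted.

```python
import json
import urllib.parse
import urllib.request

# Assumption: hypothetical API endpoint; 5665 is the default Icinga 2 API port.
API_BASE = "https://icinga.example.com:5665/v1"
HEADERS = {"Accept": "application/json", "Content-Type": "application/json"}


def host_url(name):
    """URL of a single host object; the name is percent-encoded."""
    return f"{API_BASE}/objects/hosts/{urllib.parse.quote(name, safe='')}"


def build_create_host(name, address, templates=("generic-host",)):
    """Build a PUT request that creates a host from the given templates."""
    body = {
        "templates": list(templates),
        "attrs": {"address": address, "check_command": "hostalive"},
    }
    return urllib.request.Request(
        host_url(name),
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers=HEADERS,
    )


def build_delete_host(name):
    """Build a DELETE request; cascade=1 also removes dependent objects."""
    return urllib.request.Request(
        host_url(name) + "?cascade=1", method="DELETE", headers=HEADERS
    )
```

Sending either request with `urllib.request.urlopen` would additionally need API user credentials and a trusted certificate for the Icinga 2 instance.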

Distributed Master/Slave setup with distributed pollers.

Yes. However, it is recommended to have at most two distributed pollers in the same zone; they share load and checks. A known bug prevents building a zone with, for example, 10 pollers in HA.

Authentication Local and LDAP integration

Yes for icingaweb

Suppress alerts based on a parent child relationship.

Yes. However, dependencies with multiple parents can be tricky to implement yourself, since they are not natively supported.

Some form of integration with external CMDB via API.

Yes

Anomaly detection and Trend predictions.

No, not natively; you'll need to export Icinga data to another tool to do that.

Execute external scripts when events happen. This is user defined obviously.

Yes. For example, you can have a notification run a script on a passive check to implement this.

Device discovery and device rediscovery when needed by user. Ideally a button i click to see a new graph that was configured.

It's not natively supported, to my knowledge.

Flap protection (this is a pretty big one to try and eliminate false positives as much as possible). So many monitoring systems claim to have this but it does not work.

Icinga implements flapping detection; more information here:
https://icinga.com/docs/icinga2/latest/doc/08-advanced-topics/#check-flapping
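To illustrate the general idea behind flap detection, here is a simplified sketch: count how often the state changed over the most recent checks and switch the "flapping" flag with a high and a low threshold, so the flag itself does not flap. This is not Icinga's exact calculation (see the linked documentation for that); the window of 20 checks and the 30%/25% thresholds are illustrative assumptions.

```python
from collections import deque


class FlapDetector:
    """Simplified flap detection with hysteresis (illustrative only)."""

    def __init__(self, window=20, high=30.0, low=25.0):
        # Each entry is True if that check's state differed from the previous one.
        self.changes = deque(maxlen=window)
        self.high = high
        self.low = low
        self.last_state = None
        self.flapping = False

    def update(self, state):
        """Record a new check result and return the current flapping flag."""
        if self.last_state is not None:
            self.changes.append(state != self.last_state)
        self.last_state = state
        percent = 100.0 * sum(self.changes) / self.changes.maxlen
        if not self.flapping and percent > self.high:
            self.flapping = True   # too many recent state changes
        elif self.flapping and percent < self.low:
            self.flapping = False  # the object has calmed down again
        return self.flapping
```

While an object is flagged as flapping, a monitoring system would typically hold back its normal problem and recovery notifications, which is what cuts down the false positives.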

Bulk edit of hosts. For example, I want to change a template for all marked hosts, or a new SNMP string, or groups, or enable/disable. You get the idea: things that are in common on multiple devices.

Yes, Icinga supports templates; however, you can't change them at runtime. You'll need to restart Icinga after editing a template.

Ability for custom development that can be paid for if needed.

I don't know; I'd prefer to let the Icinga team answer that themselves. However, I know that Icinga welcomes sponsorship for developing new features.

Control what is being polled on devices, for example bulk polling once a day to save resources. Then do not poll any interfaces without descriptions, or do not poll any interfaces for a particular type of router because they are management interfaces, and other such rules.

I'm not sure I grasp your whole need here, but from what I understand I can see two ways of doing this:

a dedicated check that runs once a day
a caching system in your collection script for whatever you want to avoid polling too often
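The caching idea could be sketched like this: the collection script keeps the last expensive poll result in a small JSON file and only polls again once the cached copy is older than a chosen age. A file is used because each check run is a separate process. The function name, the cache layout and the `poll` callable are assumptions for illustration.

```python
import json
import os
import tempfile
import time


def cached_poll(cache_path, max_age_seconds, poll, now=None):
    """Return cached data if it is fresh enough, else call poll() and store it."""
    now = time.time() if now is None else now
    try:
        with open(cache_path, "r", encoding="utf-8") as handle:
            entry = json.load(handle)
        if now - entry["stored_at"] < max_age_seconds:
            return entry["data"]
    except (OSError, ValueError, KeyError, TypeError):
        pass  # missing or unreadable cache: fall through to a fresh poll

    data = poll()  # the expensive part, e.g. a bulk SNMP walk
    # Write to a temporary file and rename it, so readers never see a partial file.
    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump({"stored_at": now, "data": data}, handle)
    os.replace(tmp_path, cache_path)
    return data
```

With `max_age_seconds=86400` the device would be polled at most once a day, while the check itself can still run at its normal interval and report from the cached values.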

Manage initial installation and configuration with Puppet or Ansible.

Both are supported
https://github.com/Icinga/puppet-icinga2
https://github.com/Icinga/ansible-playbooks

I'll give you some personal feedback about Icinga: it's a great tool. Its biggest advantages, in my opinion, are how easy it is to install, manage, maintain, and integrate with most other parts of a monitoring stack (Grafana, TSDB, etc.), on both the frontend and the backend, because it is based on standard protocols, methods and technologies. Also, the way configuration works lets you deploy a lot of checks with little effort, while still letting you deal with specific cases when required. The downside is that scaling is limited to two nodes per zone, but that's not a problem unless you start hitting a huge number of hosts (like 100k+, assuming you have decent servers). Overall, and from my experience, Icinga is a reliable tool.

Also, based on your questions, I don't think a monitoring tool implementing all of this exists, especially among open-source solutions. Nowadays monitoring (especially for big infrastructures) is more about stacking different, complementary tools, with custom development required for some parts.

Besides my answers, I strongly advise you to form your own opinion by grabbing an Icinga image here and testing it:
https://github.com/Icinga/icinga-vagrant

theFeu · August 5, 2020, 11:44am

Hello there!

Big thanks @Someone for answering the first bunch of questions!

This post looks a lot like an RFP to me though, for which the forum is not the right space, as it is meant for users of Icinga to help each other out with their setups.

If you need professional support, I would suggest contacting our sales department.
They will be happy to help you

Best of luck!
Feu

bagz · August 5, 2020, 12:13pm

That's fine, but thank you @Someone for giving me a good indication, in a single page, of which features are supported.