Share your Icinga Environment

environment
icinga
(Michael Friedrich) #1

Dear all,

we use Icinga and its integrations in many different ways, and we love to hear more about it in your Icinga Camp/Meetup talks. Since we, as the people behind Icinga, also know that not everyone can give a talk, we’d like to start a challenge here.

Share your Icinga environment

Include everything you think is worth sharing by replying to this topic. Here are a few pointers on the basics you can add. Be creative; the list is neither complete nor mandatory.

  • What criteria did you weigh up in deciding to engage Icinga?

  • How many hosts/services do you monitor? (Hint: icinga2 daemon -C)

  • Describe your Icinga Setup roughly (High Availability, Satellites, Agents etc.)

  • How do you add new monitoring objects? Describe your configuration workflow roughly.

  • Name your favourite Icinga Module

  • How do you integrate Icinga with your other Tools?

  • What’s that ONE piece you’re missing in Icinga?

  • Spice it up, attach some screenshots from your Tactical Overview

Is there anything else you would like to share? Don’t hesitate, add everything you think is interesting!

Rewards

Share your Icinga Environment! Reward for the first 3 Icinga users taking part: One fancy blue Icinga drink bottle and a free ticket to Icinga Camp Berlin!

Until when?

Take your time collecting all the details and screenshots, but as always, add them as soon as possible. To give everyone the chance to step in, like and share, the first round targets the end of March 2019.

Thanks for sharing!

Cheers,
Michael

(George K) #3

Hi all,

Since people seem shy to post their Icinga setup, I will post our small environment and hopefully inspire others to do the same.

First of all, we are a publicly funded institution, so our budget is tight. We tried both New Relic and Datadog, which are fine at what they do but too pricey for us, and we had some GDPR-related issues. Some in the team had previous experience with Nagios and OP5, but we opted for Icinga because we wanted to pair it with our Graylog servers and get an overview of what is happening in our systems. The interface also seemed good enough.

Our environment is small: 124 servers and 1010 services are monitored. We have no need for HA or satellites at the moment, so we run a single VM with Icinga and two containers with Grafana and InfluxDB. We gather most information through network plugins, SSH (load, memory, etc.) for Linux servers, and WMI for Windows servers.

We don’t use Director either. We wanted to keep it as simple as possible and went with plaintext files. We manage them with a combination of Ansible playbooks, mostly for adding/removing servers, and some manual work for adjusting services and variables.
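For readers who want to try a similar plaintext-file workflow, here is a minimal sketch of how a playbook helper might render a host definition. It is not the poster's actual tooling: the function names, the `generic-host` template (taken from the Icinga 2 sample configuration) and the example host are assumptions for illustration.

```python
def quote(value):
    """Quote a string for the Icinga 2 DSL, escaping backslashes and quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def render_host(name, address, templates=("generic-host",), custom_vars=None):
    """Render one `object Host` definition as plain config text."""
    lines = [f"object Host {quote(name)} {{"]
    for template in templates:
        lines.append(f"  import {quote(template)}")
    lines.append(f"  address = {quote(address)}")
    # Custom variables are emitted in sorted order so the file diffs cleanly.
    for key, value in sorted((custom_vars or {}).items()):
        lines.append(f"  vars.{key} = {quote(value)}")
    lines.append("}")
    return "\n".join(lines) + "\n"


# Hypothetical inventory entry; a playbook would loop over the real inventory.
print(render_host("web01.example.com", "192.0.2.10", custom_vars={"os": "Linux"}))
```

The generated text can be written to a `.conf` file under the configuration directory and validated with `icinga2 daemon -C` before reloading.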

Our favourite Icinga module is the Grafana module by Carsten Kobke. We love Grafana and praise Carsten daily for the plugin.

As for integrations, we use the Graylog plugin to check our Graylog streams, and we have written a custom plugin to integrate with StatusCake, which checks our services externally. Previously Icinga also reported issues to HipChat, but since we moved to Slack we stopped sending messages there due to limits in the free version. We also show the Dashing interface on a big screen, so we know quite fast when things break.

We would really like to have proper reporting functionality for all sorts of cases, but for the IT team a better way to handle notifications (maybe NoMa2?) is our number one request right now.

Finally, a big thank you to all who have worked hard to make Icinga a great project, a top-level monitoring tool and a fantastic community.

/gk

#4

Hello all,

I just found this thread, and I’d like to share our “not so small” platform.

We reviewed several monitoring tools a while ago and decided to migrate from OpenNMS (our old platform) to Icinga because of its integration with other tools and its easy yet powerful configuration.

As you can see on the screenshot, we’ve got ~5900 hosts in total and ~79k services. Sadly for us, it’s not all green.

Our infrastructure has around 35 satellites (internal and external ones) and 1 master node. We do most of our checks through SNMP, but we also use our own scripts to run some checks (using Python, Bash, and even Docker containers for specific services). We also have 1 node with Grafana and another with InfluxDB.

At the moment, we don’t use Director, so our configuration is done directly on the files. Since there is a big group of people working on the configuration, we manage it through our SCM and pull requests to avoid issues.

Like @gkoutsog, I would say that the Grafana module is our favourite.

Regarding integration, we manage Icinga notifications through HipChat. We have also integrated Icinga with an internal NOC (we’ve got other tools like Splunk in the NOC as well).

At the moment, we’d like to have a configurable reporting tool that allows us to get metrics and maybe some graph reports.

I’m new to the community forum, so I’ll be checking the documentation and sharing my experience with all of you.

Thanks

(Alex) #5

Hi all,

compared to the two other environments, ours is a little bit smaller :slight_smile:

We also reviewed some monitoring tools, but started with Nagios first. We visited the datacenter here in Salzburg and asked what monitoring system they use. They said Icinga. So here we are :slight_smile:

We monitor 59 hosts and 803 services.

icinga2 daemon -C
information/cli: Icinga application loader (version: r2.10.3-1)
information/cli: Loading configuration file(s).
information/ConfigItem: Committing config item(s).
information/ApiListener: My API identity: master.fqdn.example.com
information/ConfigItem: Instantiated 803 Services.
information/ConfigItem: Instantiated 1 InfluxdbWriter.
information/ConfigItem: Instantiated 1 IcingaApplication.
information/ConfigItem: Instantiated 59 Hosts.
information/ConfigItem: Instantiated 1 FileLogger.
information/ConfigItem: Instantiated 4 NotificationCommands.
information/ConfigItem: Instantiated 881 Notifications.
information/ConfigItem: Instantiated 1 NotificationComponent.
information/ConfigItem: Instantiated 16 HostGroups.
information/ConfigItem: Instantiated 1 ApiListener.
information/ConfigItem: Instantiated 1 Comment.
information/ConfigItem: Instantiated 1 CheckerComponent.
information/ConfigItem: Instantiated 22 Zones.
information/ConfigItem: Instantiated 20 Endpoints.
information/ConfigItem: Instantiated 2 ApiUsers.
information/ConfigItem: Instantiated 3 Users.
information/ConfigItem: Instantiated 1 IdoMysqlConnection.
information/ConfigItem: Instantiated 230 CheckCommands.
information/ConfigItem: Instantiated 1 UserGroup.
information/ConfigItem: Instantiated 8 ServiceGroups.
information/ConfigItem: Instantiated 3 TimePeriods.
information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
information/cli: Finished validating the configuration file(s).
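The host and service totals can be pulled out of this kind of output with a few lines of Python. This is only a sketch: the helper names are made up for illustration, and it assumes the `Instantiated <count> <Type>.` wording shown in the output above.

```python
import re
import subprocess

# Matches lines such as "information/ConfigItem: Instantiated 803 Services."
INSTANTIATED = re.compile(r"Instantiated (\d+) (\w+)\.")


def count_objects(output):
    """Return a {object type: count} mapping parsed from config check output."""
    counts = {}
    for line in output.splitlines():
        match = INSTANTIATED.search(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


def run_config_check():
    """Run the config validation and parse it (needs icinga2 installed)."""
    result = subprocess.run(
        ["icinga2", "daemon", "-C"], capture_output=True, text=True, check=False
    )
    # Read both streams, since it is an assumption where the log lines end up.
    return count_objects(result.stdout + result.stderr)


sample = """\
information/ConfigItem: Instantiated 803 Services.
information/ConfigItem: Instantiated 59 Hosts.
information/cli: Finished validating the configuration file(s).
"""
print(count_objects(sample))
```

The same pattern also matches the timestamped variant of the output, because the regular expression only looks at the `Instantiated` part of each line.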

The master is running on a HPE ProLiant DL380 Gen7 with InfluxDB and Grafana.
Checks are done through the Icinga agent, SSH and SNMP. We also use some of our own check plugins written in Perl, Python and C#.
The configuration is done through .conf files. For version control we use Git (GitLab).

In addition to the standard mail notifications, we also get notifications via SMS for critical hosts/services. For sending we use a simple UMTS stick with a SIM card and the gammu package.
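A notification script for such an SMS setup could look roughly like the sketch below. It only builds the command line and does not send anything. The `gammu sendsms TEXT <number> -text <message>` form is taken from the Gammu command-line documentation as far as I recall it; the function names, the phone number and the 160-character cut-off are assumptions for illustration, not the poster's actual script.

```python
def format_alert(host, service, state, output):
    """Build a short, single-line alert text from notification fields."""
    return f"{state}: {host}/{service} - {output}".replace("\n", " ")


def build_sms_command(number, message, max_length=160):
    """Return the argv list for sending one SMS through gammu.

    The message is cut to max_length characters so it fits a single SMS.
    """
    text = message[:max_length]
    return ["gammu", "sendsms", "TEXT", number, "-text", text]


# Hypothetical values; a real script would read them from the notification
# command's arguments or environment.
alert = format_alert("db01", "disk", "CRITICAL", "DISK CRITICAL - / 97% used")
print(build_sms_command("+430000000000", alert))
```

Passing the list to `subprocess.run(...)` would perform the actual send; building the argv as a list avoids shell quoting problems with check output.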

New monitoring objects are currently added manually. We are planning to automate this via Chef, which would install and configure the icinga2 agents.

For our employees we made a little system status page, so they can check whether something isn’t working. The data is requested via the API. At our office we have two monitors to show what’s happening in our infrastructure: the first one displays an icinga2 dashboard and the second one rotates through Grafana dashboards with a playlist.
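As an illustration of the status page idea, the sketch below summarizes service states from an Icinga 2 REST API response. It assumes the `/v1/objects/services` endpoint with its `results[].attrs.state` layout and the usual service state codes (0 to 3); the function names, credentials and CA file path are placeholders, not details from the post.

```python
import base64
import json
import ssl
import urllib.request
from collections import Counter

STATE_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def summarize(payload):
    """Count services per state name from an API response payload."""
    counts = Counter()
    for result in payload.get("results", []):
        state = int(result["attrs"]["state"])  # the API reports states as floats
        counts[STATE_NAMES.get(state, "UNKNOWN")] += 1
    return dict(counts)


def fetch_services(base_url, user, password, ca_file):
    """Query the API for all service states (requires a reachable Icinga 2)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"{base_url}/v1/objects/services?attrs=state",
        headers={"Accept": "application/json", "Authorization": f"Basic {token}"},
    )
    context = ssl.create_default_context(cafile=ca_file)
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)


# Hand-written sample payload in the assumed response shape.
sample = {"results": [{"attrs": {"state": 0.0}}, {"attrs": {"state": 2.0}},
                      {"attrs": {"state": 0.0}}]}
print(summarize(sample))
```

A status page would call `fetch_services(...)` with a read-only ApiUser and render the summary; keeping the summarizing step separate makes it easy to test without a running master.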

As the others said, the Grafana module is our favourite too. Thanks to @Carsten!
We also use the x509 module for certificate handling, thanks for that.
A built-in reporting tool would be nice. I already read that something is in progress, and I’m looking forward to using it in the near future.

Thanks for reading, and a big thanks to all who made Icinga a great, maybe even the best, monitoring tool. We love it!

Greetz

(Christian Moritz) #6

here is my environment.

monitoring stuff:

  • 180-190 network switches with snmp
  • 20 esxi hosts
  • 12 Firewalls
  • Voice Gateways
  • Branch Office Gateways / PSTN Breakouts
  • a couple of SAN switches
  • and other infrastructure.
  • and all the VMs which are required for infrastructure services…
    including DCs, Exchange, Backup, Antivir and other infrastructure servers/services.

enabled Features / Modules:

  • Director
  • x509
  • vSphere

So a really big thanks to the team from NETWAYS, which is “Head of Development” for the Icinga features/modules.

You guys, and girls as well, do a really good job. :star_struck:

(Kevin Honka) #7

after reading the entries, I decided to show mine as well :smiley:

Decision for Icinga

I followed the icinga2 development from the early days of the beta, when there was no icingaweb2. After I switched companies, the new department had many issues related to nonexistent or spotty monitoring, so the first thing I was tasked with was fixing all the problems, which included setting up a monitoring system. Naturally I decided to use Icinga2, as it seemed the most powerful and accessible tool, with a nice community.

Icinga Setup

Our setup is rather small, as it only covers our department of 14 people. Nonetheless, we currently have around 59 VMs dedicated to our department, all of which need to be controlled and monitored.

Workflows

Adding Hosts to Icinga2

Hosts are added automatically when they are set up. We run Ansible playbooks to fully configure our hosts, so it was a natural decision to also add them to Icinga via a playbook. This is realised with a Python script that talks to the Director REST API and adds the hosts with configuration values for host groups, processes to monitor, etc.
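The sketch below shows what such a script might send. It is not the poster's code: it assumes the `director/host` endpoint and the `object_name` / `object_type` / `address` / `imports` / `groups` / `vars` fields described in the Director REST API documentation, while the function names, base URL, template, group and credentials are placeholders.

```python
import base64
import json
import urllib.request


def director_host_payload(name, address, imports=(), groups=(), custom_vars=None):
    """Build the JSON body for creating one host object in the Director."""
    payload = {"object_name": name, "object_type": "object", "address": address}
    if imports:
        payload["imports"] = list(imports)
    if groups:
        payload["groups"] = list(groups)
    if custom_vars:
        payload["vars"] = dict(custom_vars)
    return payload


def director_request(base_url, payload, user, password):
    """Prepare (but do not send) the POST request for the Director API."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url}/director/host",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Hypothetical host, template and group names.
body = director_host_payload(
    "app01.example.com", "192.0.2.20",
    imports=["linux-server"], groups=["app-servers"],
    custom_vars={"monitored_processes": ["nginx", "sshd"]},
)
request = director_request("https://icinga.example.com/icingaweb2", body, "api", "secret")
print(request.get_method(), request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` would create the host. The Director still needs a configuration deployment before the new host becomes active.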

Modifying Icinga Objects at runtime

We need to modify objects without engaging the Director, so I teamed up with @bodsch to port his API from Ruby to Python. You can find the API here: GitHub Repo. It is currently rather stale, as I do not need any more functions.

Favorite Module

This is a hard one, as I really like both the Director and the Grafana module and can’t decide which one has brought me more joy.

Integration

Icinga is currently integrated with:

  • Ansible
  • FreeIPA for authentication and soon Host Import

Missing Feature

I would really like an easy-to-use graphical interface for notifications. Yes, I’m looking at you, NoMa :smiley:

(Nilesh) #8

Overview

We have a hybrid cloud landscape that combines AWS cloud and on-prem infrastructure. As SREs, our focus is to ensure 100% availability of all OS | App | DB | Network | Middleware services. We faced a lot of challenges in the past due to gaps in monitoring coverage. The main goal behind implementing Icinga is to ensure 100% monitoring coverage through continuous monitoring.

Coverage
Currently our entire infrastructure is monitored by Icinga, which covers ~1300 servers and ~24000 services. The Icinga master runs in high availability and connects to 8 satellites.

Technology
All new monitoring happens via continuous monitoring automation. This automation ensures that all newly discovered hosts get onboarded along with their respective services (e.g. automatic monitoring onboarding of OS | App | DB | MW per technology).

Key Integration

  • Icinga – InfluxDB – Grafana: OS metrics dashboards | component dashboards (Redis | ZooKeeper | NGINX | JVM etc.)

  • Icinga – OpsGenie: auto-escalation for critical and warning alerts

  • Self-healing: automated actions after critical alerts (automatic disk cleanup | process restart) using Jenkins integration

Summary
We are looking forward to seeing out-of-the-box reporting capabilities from Icinga. It is our primary monitoring tool, and most of our incidents are captured proactively.

#9

What criteria did you weigh up in deciding to engage Icinga?
About three years ago we started evaluating how to replace our existing monitoring setup (icinga1, plus Centreon as config UI) that we used for ourselves and our customers. We had some 3-4 different solutions with which we (tried to) build a testing environment.
Icinga2 won the race because of the rewritten code and because it is no longer a Nagios fork. It was also easy to set up and offered a modern-looking interface (with many integrated modules) and an out-of-the-box distributed/HA option. For us, the biggest selling point was the announcement of the Icinga Director.

How many hosts/services do you monitor? (Hint: icinga2 daemon -C )

icinga2 daemon -C
# icinga2 daemon -C
[2019-03-12 08:30:56 +0100] information/cli: Icinga application loader (version: r2.10.3-1)
[2019-03-12 08:30:56 +0100] information/cli: Loading configuration file(s).
[2019-03-12 08:30:56 +0100] information/ConfigItem: Committing config item(s).
[2019-03-12 08:30:56 +0100] information/ApiListener: My API identity: :)
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 1822 Services.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 1 IcingaApplication.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 306 Hosts.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 1 FileLogger.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 195 Dependencies.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 15 NotificationCommands.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 2412 Notifications.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 1 NotificationComponent.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 53 HostGroups.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 1 ApiListener.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 1 GraphiteWriter.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 1 CheckerComponent.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 7 Zones.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 1 ExternalCommandListener.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 6 Endpoints.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 3 ApiUsers.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 12 Users.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 1 IdoMysqlConnection.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 258 CheckCommands.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 7 ServiceGroups.
[2019-03-12 08:30:58 +0100] information/ConfigItem: Instantiated 14 TimePeriods.

Describe your Icinga Setup roughly (High Availability, Satellites, Agents etc.)
This monitoring system is just for our internal IT infrastructure and monitors our Windows servers (Exchange, SQL, System Center, ADFS, SOFS), some Linux servers, Cisco switches and ASAs, NetApps, Cisco UC equipment and various other hosts.

  • two Master servers as a HA cluster in our headquarter, where most of our internal IT is located
  • three satellites in our bigger remote locations, that monitor the local internal IT infrastructure
    – the smaller locations get monitored via the Master servers
  • mariaDB on a separate server
  • graphite on a separate server

How do you add new monitoring objects? Describe your configuration workflow roughly.
There is no real automation going on. Windows servers get imported from our AD via Import/Sync in the Director. Anything else is added manually (though this happens rarely, as there is not much change in our IT infrastructure).

Name your favourite Icinga Module
:heart_eyes:Icinga Director:heart_eyes:

What’s that ONE piece you’re missing in Icinga?
The reporting module :smiley:

All in all I really like icinga2. It is nice/easy to set up, runs well, looks good, is highly modular/customizable, has a very good configuration interface :slight_smile: and a great community!
Thanks for developing it :+1:
