Tearing my hair out & losing the will to live

OK, so I’ve used Nagios in the past and I’ve installed loads of software packages on Linux. I wouldn’t call myself an expert, but I’m far from a noob, and yet the install instructions are just all over the place. I totally failed to get a working system, so I eventually flattened my OS, re-installed, followed a great guide in the HowTo section and, with a little tweaking of that guide, got a server running.

Great, now I can monitor the monitor system. Next task was to get an external agent to run. Wow! Here we go again: I could not get a Linux agent to do anything useful. It would not check in, yet it did not log an error anywhere.

OK, do not despair, let’s try the Windows agent; that should be easy. Guess what, that doesn’t work either, but at least now I have something in the server logs: it’s complaining about a self-signed certificate, so I do the dance:
icinga2 ca list
icinga2 ca sign myfingerprint

Still fails, still says it’s a self-signed certificate, and no amount of Icinga bouncing (at this stage I’d happily bounce it out of a window on the 40th floor) or certificate re-signing does anything. Checking the forums, this doesn’t seem to be a new issue, but it also doesn’t seem to have a definitive resolution.

[2020-08-07 15:57:01 +0100] information/ApiListener: New client connection for identity ‘DESKTOP-ST0BD8I’ from []:54910 (certificate validation failed: code 18: self signed certificate)
[2020-08-07 15:57:01 +0100] warning/ApiListener: No data received on new API connection from []:54910 for identity ‘DESKTOP-ST0BD8I’. Ensure that the remote endpoints are properly configured in a cluster setup.

ubuntu@monitoring:~$ sudo icinga2 --version
icinga2 - The Icinga 2 network monitoring daemon (version: r2.11.4-1)

Copyright © 2012-2020 Icinga GmbH (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later http://gnu.org/licenses/gpl2.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

System information:
Platform: Ubuntu
Platform version: 20.04.1 LTS (Focal Fossa)
Kernel: Linux
Kernel version: 5.4.0-42-generic
Architecture: x86_64

Build information:
Compiler: GNU 9.3.0
Build host: runner-ltrjqz9n-project-298-concurrent-0

Yeah, the learning curve is steep and it’s crucial to understand the concepts behind distributed monitoring.

In general, these points are sometimes confusing:

  • you need to define zone and endpoint objects in zones.conf only (using V2.11; I haven’t checked whether this changed again with V2.12).

  • each agent needs its own zone, and the agent’s hostname, zone name and endpoint object name need to be identical
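
To make the naming rule concrete, here is a minimal sketch of what a zones.conf on the master could look like for the setup in this thread. The master name monitoring.example.co.uk is a placeholder (the real name is redacted in the logs), the agent name DESKTOP-ST0BD8I is taken from the log lines above, and the overall layout is an assumption rather than the poster's actual configuration.

```icinga2
// /etc/icinga2/zones.conf on the master (sketch; names are assumptions)

// The master's endpoint name should match the CN of its certificate,
// shown as "My API identity" in the validation output.
object Endpoint "monitoring.example.co.uk" {
}

object Zone "master" {
  endpoints = [ "monitoring.example.co.uk" ]
}

// One Endpoint and one Zone per agent, both named exactly like the
// agent's certificate CN, with the master zone as parent.
object Endpoint "DESKTOP-ST0BD8I" {
}

object Zone "DESKTOP-ST0BD8I" {
  endpoints = [ "DESKTOP-ST0BD8I" ]
  parent = "master"
}
```

The agent's own zones.conf would need the same object names, so that both sides agree on who is who.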

I have one server so surely zone & endpoint would already be the same and set as a default?

I’m using Director; wouldn’t the lack of any of this information be flagged?

I guess my major stumbling block and reason for higher expectations is that I already use Connectwise Automate for monitoring my Windows clients, having installed the server I just install an agent and it works from anywhere.

Hello @Bassman,
Sorry you are having such a hard time. Please do not tear your hair out anymore. As you get older it will disappear all by itself. From the sounds of it you have already reviewed the online documentation about agent setup.

Do you receive any errors when running the validation command (icinga2 daemon -C) on your Linux agent and Icinga 2 master? What features (icinga2 feature list) do you have enabled on your master and agent? Can you share your zone.conf file from your master and agent?

Regards
Alex

As I said, I’d given up on the Linux agent and went for the Windows one.

On the server:

ubuntu@monitoring:/usr/lib/nagios/plugins$ sudo icinga2 daemon -C
[2020-08-07 22:23:26 +0100] information/cli: Icinga application loader (version: r2.11.4-1)
[2020-08-07 22:23:26 +0100] information/cli: Loading configuration file(s).
[2020-08-07 22:23:26 +0100] information/ConfigItem: Committing config item(s).
[2020-08-07 22:23:26 +0100] information/ApiListener: My API identity: monitoring.**********.co.uk
[2020-08-07 22:23:26 +0100] information/ConfigItem: Instantiated 1 HostGroup.
[2020-08-07 22:23:26 +0100] information/ConfigItem: Instantiated 1 FileLogger.
[2020-08-07 22:23:26 +0100] information/ConfigItem: Instantiated 1 NotificationComponent.
[2020-08-07 22:23:26 +0100] information/ConfigItem: Instantiated 1 IcingaApplication.
[2020-08-07 22:23:26 +0100] information/ConfigItem: Instantiated 2 Hosts.
[2020-08-07 22:23:26 +0100] information/ConfigItem: Instantiated 1 ApiListener.
[2020-08-07 22:23:26 +0100] information/ConfigItem: Instantiated 1 CheckerComponent.
[2020-08-07 22:23:26 +0100] information/ConfigItem: Instantiated 3 Zones.
[2020-08-07 22:23:26 +0100] information/ConfigItem: Instantiated 1 ExternalCommandListener.
[2020-08-07 22:23:26 +0100] information/ConfigItem: Instantiated 1 Endpoint.
[2020-08-07 22:23:26 +0100] information/ConfigItem: Instantiated 1 ApiUser.
[2020-08-07 22:23:26 +0100] information/ConfigItem: Instantiated 1 IdoMysqlConnection.
[2020-08-07 22:23:26 +0100] information/ConfigItem: Instantiated 236 CheckCommands.
[2020-08-07 22:23:26 +0100] information/ConfigItem: Instantiated 1 TimePeriod.
[2020-08-07 22:23:26 +0100] information/ConfigItem: Instantiated 1 User.
[2020-08-07 22:23:26 +0100] information/ConfigItem: Instantiated 2 Services.
[2020-08-07 22:23:26 +0100] information/ScriptGlobal: Dumping variables to file ‘/var/cache/icinga2/icinga2.vars’
[2020-08-07 22:23:26 +0100] information/cli: Finished validating the configuration file(s).

ubuntu@monitoring:/usr/lib/nagios/plugins$ sudo icinga2 feature list
Disabled features: compatlog debuglog elasticsearch gelf graphite influxdb livestatus opentsdb perfdata statusdata syslog
Enabled features: api checker command ido-mysql mainlog notification

On the Windows workstation:

C:\Program Files\ICINGA2\sbin>icinga2 daemon -C
[2020-08-07 22:26:39 +0100] information/cli: Icinga application loader (version: v2.12.0)
[2020-08-07 22:26:39 +0100] information/cli: Loading configuration file(s).
[2020-08-07 22:26:39 +0100] critical/cli: Could not compile config files: Error: Function call ‘std::ifstream::open’ for file ‘C:\ProgramData\icinga2\etc\icinga2/icinga2.conf’ failed with error code 13, ‘Permission denied’

C:\Program Files\ICINGA2\sbin>icinga2 feature list
critical/Application: Icinga 2 has terminated unexpectedly. Additional information can be found in ‘C:\ProgramData\icinga2\var\log\icinga2/crash/report.1596835662.032000’

There is no zone.conf file on either system.

The Windows client was the latest version and I have admin rights on the PC. There were no errors reported during the setup, but clearly it’s not happy.

Hello !
No wonder you are having so much trouble with the installation: Ubuntu 20 is not in the compatibility matrix, so it requires Icinga or system tweaking to get it to work.

For your Windows version, I don’t know, but you should check as well.

Related doc:

The link to this doc is at the very start of the installation process:
https://icinga.com/docs/icinga2/latest/doc/02-installation/#setting-up-icinga-2

That’s nice, but it is not obvious and not actually in the installation docs. It’s also the support matrix for paid support, which I don’t have, so it’s irrelevant. It simply means that if you install on a version they don’t list, they won’t provide support; it does not mean that it won’t work, nor that it is so flaky that it will collapse into a smouldering heap of poo if you should dare to upgrade or try to use a more recent OS version.

Furthermore, throughout the installation docs there are references to different CentOS versions (6, 7 & 8), but Debian/Ubuntu are only ever mentioned generically, with no version-specific instructions for either. It’s therefore not unreasonable to assume that any current, non-EOL version of Debian/Ubuntu can be used.

If I’ve missed version-specific instructions for Ubuntu in the docs, I’m happy to be proved wrong. I could not find any section that said of Ubuntu “only use this version” or “never use that version”.

So, in an effort to stop my hair falling out, and as local monitoring is working, I decided to turn my attention to notifications. I ran through a simple checklist.

Does the mail command work? Yup, it does.
Does mail-host-notification.sh work? Yes indeedy it does!
Does mail-service-notification.sh work? Yes, mail received loud and clear.

OK, let’s try sending a notification from the web interface: add a comment, force send.

Radio silence.

Zilch

Nothing

Nada

I check the log and there is no error:

[2020-08-09 01:00:50 +0100] information/HttpServerConnection: Request: POST /v1/actions/send-custom-notification (from [127.0.0.1]:34844), user: root, agent: ).
[2020-08-09 01:00:50 +0100] information/HttpServerConnection: HTTP client disconnected (from [127.0.0.1]:34844)

I’m getting to the end of my tether with this hobbyware.

Another problem as I try to find out why a simple task like sending an email is challenging to this software.

[2020-08-09 11:31:36 +0100] information/cli: Icinga application loader (version: r2.11.4-1)
[2020-08-09 11:31:36 +0100] information/cli: Loading configuration file(s).
[2020-08-09 11:31:36 +0100] information/ConfigItem: Committing config item(s).
[2020-08-09 11:31:36 +0100] information/ApiListener: My API identity: monitoring.mydomain.co.uk
[2020-08-09 11:31:36 +0100] critical/config: Error: Validation failed for object ‘centos!check_load!my-email-service’ of type ‘Notification’; Attribute ‘command’: Object ‘mail-service-notification’ of type ‘NotificationCommand’ does not exist.
Location: in [stage]/zones.d/master/notification_templates.conf: 5:5-5:41
[stage]/zones.d/master/notification_templates.conf(3): begin = 30s
[stage]/zones.d/master/notification_templates.conf(4): }
[stage]/zones.d/master/notification_templates.conf(5): command = “mail-service-notification”
                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[stage]/zones.d/master/notification_templates.conf(6): interval = 5m
[stage]/zones.d/master/notification_templates.conf(7): period = “7x24”
[2020-08-09 11:31:36 +0100] critical/config: 1 error
[2020-08-09 11:31:36 +0100] critical/cli: Config validation failed. Re-run with ‘icinga2 daemon -C’ after fixing the config.

Apparently, Object ‘mail-service-notification’ of type ‘NotificationCommand’ does not exist.

If it doesn’t exist, then what is it that is being used in a notification template???

I see two mistakes:

Second, your agent is newer than its parent but this is not supported.

“It generally is advised to use the newest releases with the same version on all instances”

Only advised, not unsupported. If it was unsupported as in not working, then there would be no point in releasing new versions of the agent as they could never be used.

My experience of software is that it is generally advised to use the latest version where possible, unless there are clear indications that two versions would not be compatible, such as a change in communication protocols. I can’t see that this is the case here.

You are receiving a “Permission denied” error when running the command on your Windows box. Are you running this command with administrator permission?

The zones.conf file is included with the Icinga installation. Please search your Linux box for the zones.conf file. Each Linux distribution may place it in a different spot (for example /etc/icinga2/zones.conf).

The zones.conf file is located at “C:\ProgramData\icinga2\etc\icinga2\zones.conf” on your Windows box.

Please review the online documentation on how to configure the zones.conf file for your setup.

Alex

I can only recommend that you have at least the same major version (e.g. 2.12.x) everywhere. This will save you a lot of trouble, since the JSON-RPC messaging can change and may be interpreted differently from one version to another. This is even more true since the docs recommend upgrading the master first, not the agent (your agent is newer than your master).

Also, from the doc :
https://icinga.com/docs/icinga2/latest/doc/06-distributed-monitoring/#versions-and-upgrade

Older agent versions may work, but there’s no guarantee. Always keep in mind that older versions are out of support and can contain bugs.

In terms of an upgrade, ensure that the master is upgraded first, then involved satellites, and last the Icinga agents. If you are on v2.10 currently, first upgrade the master instance(s) to 2.11, and then proceed with the satellites. Things are getting easier with any sort of automation tool (Puppet, Ansible, etc.).

I’m confused. I followed the installation docs and used the repositories provided, yet somehow I’ve ended up with 2.11 on my master and 2.12 on my agent. So who is wrong, me or the documentation everyone keeps pointing me at?

2.12 is very recent; it was released 7 days ago. If you installed your master before the 2.12 release and your agent after it, it sounds pretty normal to me that you ended up in this situation.

If you think the documentation is wrong, you are welcome to send a PR here :kissing_heart:

… which is not ok (as I already mentioned).

Wow! So a 0.01 point release difference can break this monitoring system?

Wow, just wow.

Hi, maybe it’s a good idea to post the complete process of how you configure the agent. If you use ca list and ca sign, you need to copy the ca.crt from the master’s PKI before you configure the certificates on the agent. The ca sign step is not necessary when you provide a ticket to the node wizard.

There are 3 ways to generate the certificate:

  1. Use a ticket to get it from the master
  2. Create it on the master and copy it over
  3. Copy the ca.crt and “auto sign” a request with ca list/sign

I’m using the third approach because the agent then does not need to be able to connect to the master/satellite, which the firewall would prevent in our environment. It’s also very easy to use for automation.
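
For the ticket-based approach (option 1), the master has to be able to generate tickets, which depends on a ticket salt being configured. A minimal sketch of the relevant master-side configuration follows, assuming the default file layout on Linux; the salt value is a placeholder, not something from this thread.

```icinga2
// /etc/icinga2/constants.conf on the master (placeholder value)
const TicketSalt = "replace-with-a-long-random-string"

// /etc/icinga2/features-available/api.conf on the master
object ApiListener "api" {
  // Used to compute and verify the tickets handed out to agents.
  ticket_salt = TicketSalt
}
```

Options 2 and 3 do not use tickets, so they should not depend on this setting.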

Edit: Icinga has so many ways to configure something that it’s sometimes hard. If you are at the beginning, it’s easy to mix up the approaches while googling for a solution.

For the notifications:

The notification configuration is in conf.d on the master. If you commented that out, it will be missing. You also need to run the import wizard in the Director to import it there. I would comment out conf.d and move the notification configuration to the master zone or global-templates.

As said, there are many ways to configure Icinga. In bigger or agent-based setups, I think it’s not a good idea to use conf.d, as that’s just an example to get going on a single Icinga instance.
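
As a sketch of that suggestion: the validation error quoted earlier says the deployed configuration references a NotificationCommand named mail-service-notification that is not defined anywhere Icinga loads. One way to make it visible is a definition outside conf.d, for example in a file under zones.d/global-templates/. The file name below is hypothetical, the global-templates zone is assumed to be defined in zones.conf, and the argument list is cut down to the options the shipped script marks as required; the definition shipped in conf.d/commands.conf remains the fuller reference.

```icinga2
// /etc/icinga2/zones.d/global-templates/notification-commands.conf
// (hypothetical file name; abbreviated sketch of the shipped definition)
object NotificationCommand "mail-service-notification" {
  command = [ ConfigDir + "/scripts/mail-service-notification.sh" ]

  // Required options of mail-service-notification.sh, filled from
  // runtime macros.
  arguments = {
    "-d" = "$icinga.long_date_time$"
    "-e" = "$service.name$"
    "-l" = "$host.name$"
    "-n" = "$host.display_name$"
    "-o" = "$service.output$"
    "-r" = "$user.email$"
    "-s" = "$service.state$"
    "-t" = "$notification.type$"
    "-u" = "$service.display_name$"
  }
}
```

If conf.d is still included and already defines the same object, loading both copies would fail validation, so only one of them should be active.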

Hello @Bassman

I understand that getting into Icinga can be very difficult. The documentation is extensive, and not as streamlined and easy to follow as we would like it to be.
Improving is an ongoing process and we are constantly working on making it easier to navigate the world of Icinga in a way that is less frustrating.
Please feel free to open issues and pull requests on GitHub, if you feel like you know how to improve a section.
If you get stuck somewhere, feel free to open topics here and I am sure you will find help.

I can also understand that it becomes very frustrating, very fast, if things don’t work the way you expect them to. And I also get the need to vent about it.
Nevertheless I point you to our Code of Conduct.

Please pay more attention to your wording and whether it comes across as insulting or mean-spirited.

.

2 Examples of what I mean:

Is hurtful and derogatory towards all the developers, contributors, helpful community members giving their all, to provide you with this software. We are all humans, we have feelings, and we are trying our all to make Icinga the best we can.

.

As I said, I understand the frustration, but plain unmasked sarcasm towards other users, that are just trying to help you with your problem, is not something I can condone.

.

Consider this an official warning and please keep the discussion civil, fair and respectful.

Regards,
Feu Mourek

I stand by my comments: there is no way this software is ready for professional use in a production environment, therefore it is of interest only to the hobbyist. I speak from experience of working in IT for over 40 years and running an IT support company that uses monitoring as part of its core offering.

There was no sarcasm intended in my comment regarding a 0.01 release breaking the system. I was told by a forum user that having this much difference between agents and master would be enough to break it and that shocked me. Am I not allowed to respond to such bizarre claims?

Warn me, kick me out and throw your toys out of the pram, I really don’t care, it won’t fix the many, many issues I have encountered.