Preferred way to deploy icinga2 in 2024

Hello everyone,

it has been some time since I deployed a new installation of icinga2. In the past, the obvious best way to deploy icinga2 was to install the respective OS packages and configure everything appropriately.

In the current infrastructure, the paradigm is to deploy everything as a docker container whenever possible. I’ve noticed that there are now also docker images for all relevant icinga2 components. At the same time, the documentation still only lists the OS packages in its installation instructions.

I found this discussion back from 2020 about whether it’s worth using docker to deploy icinga2. However, many things have changed since then.

So the question to those of you who use icinga2 intensively and manage installations: what would be your best way to deploy icinga2 in late 2024? Are the OS packages still the way to go, or is docker the better alternative?

My preferred way is to run it on bare metal using packages.

Here are some thoughts:

  • severe problems with your hypervisor → no monitoring
  • severe problems with your kubernetes cluster → no monitoring
  • severe problems with docker → less help from the community
  • self-monitoring the Icinga master from docker → I think you either need a container with more privileges, or you end up monitoring, for example, the docker disk instead of the host’s system disk (see the sketch after this list).
  • here is a thread about Icinga in a container, but it is quite old.
  • from my point of view, docker is great for testing, e.g. icingaweb2 or icingaweb2 modules against various PHP and InfluxDB versions.
  • if you have severe problems with your infrastructure and your monitoring is affected by them, it gets a lot harder to find the issues.
  • here is the blog post about ansible-collection-icinga
  • and here is an interesting talk on icinga in kubernetes:
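
On the self-monitoring point above, here is a minimal sketch of why a disk check run from inside a container does not see the host’s system disk. The image name and plugin path are assumptions (a Debian-style layout with the monitoring-plugins package installed in the image); the bind mount is only one possible workaround.

    # Inside the container, check_disk only sees the container's mount namespace,
    # so "/" is the container filesystem, not the host's system disk.
    docker run --rm my-icinga2-image \
        /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /

    # To watch the host's root filesystem from a container, you would have to
    # bind-mount it (read-only here) and point the check at the mount point.
    docker run --rm -v /:/hostfs:ro my-icinga2-image \
        /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /hostfs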

Personal opinion:
Icinga in containers is a bad idea (most of the reasons were already mentioned by @moreamazingnick); I would still go for an OS-level installation anytime.

(Also IMHO most docker/kubernetes setups which aim to replace a “classical” installation are a ticking time bomb in your setup)


Configure and install via LFOPS.

The benefits are:

  • no manual installation and thus easy reproducibility
  • as Ansible delivers config as code, it enables versioning and a paper trail (a minimal invocation sketch follows below)
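
For orientation, a minimal sketch of how such a deployment could be kicked off, assuming LFOPS refers to the Linuxfabrik Ansible collection published on Ansible Galaxy; the inventory layout, playbook name, and host pattern below are placeholders, the real roles and playbooks are documented in the LFOPS repository.

    # Install the collection (verify the name against Ansible Galaxy / the LFOPS docs).
    ansible-galaxy collection install linuxfabrik.lfops

    # Placeholder invocation: inventory path, playbook name and host pattern are
    # examples only; use the playbooks documented by LFOPS.
    ansible-playbook -i inventories/production monitoring.yml --limit icinga-master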

Thank you very much for your thoughts on this! I saw the old post back from 2020, but wondered if the conclusions from back then still apply.

We will go with a non-docker solution and try to follow the mindset of making the monitoring as robust as possible.

Thank you for your opinion on this. We will go with an OS level installation and try to take into account the hints by @moreamazingnick when possible.

Out of curiosity, when you say that you see most docker/kubernetes setups as ticking time bombs, do you mean that specifically with regard to monitoring, or generally for almost all production-level applications? (I hope this is not too off topic)

Thanks for pointing me to LFOPS! I didn’t know about this project before, but it seems great!

At the moment I’m the only one in our team who is familiar with Ansible. I’d like to use it more, but since others will also have to maintain the monitoring, I’ll try to introduce the team to it with some simple tasks first. But this will be very helpful, thank you!


IMHO no deep Ansible knowledge is needed in the team if the documentation is of reasonable quality.

I’m also the only one proficient with Ansible, but the others can copy and paste the existing configs in the Ansible inventory, register the change in git, and run the ansible-playbook commands as I documented in our wiki.
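
As an illustration, a hypothetical version of such a documented workflow; every path, file name, and playbook name below is made up for this example.

    # 1. Copy an existing host entry in the Ansible inventory and adapt it.
    cd ~/ansible/monitoring
    cp inventory/host_vars/web01.yml inventory/host_vars/web02.yml
    "${EDITOR:-vi}" inventory/host_vars/web02.yml

    # 2. Register the change in git for the paper trail.
    git add inventory/host_vars/web02.yml
    git commit -m "monitoring: add web02"

    # 3. Run the playbook as documented in the wiki, limited to the new host.
    ansible-playbook -i inventory site.yml --limit web02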

My perspective on docker/kubernetes setups is that those technologies were and are overhyped and often seen as a “one size fits all” solution for anything.
The “old school” setup based on an operating system distribution and packages has a lot of problems and complexities; it is hard to actually understand how it works, what is happening, and how things should be done right.

But the misconception I see here is that something based on docker/podman/kubernetes replaces all of that; mostly it builds on top of and often conflicts with the existing systems. Instead of replacing the need to know how a specific OS works and how components interact there with the need to know how docker/kubernetes/podman/whatever works, you now have to know how both of those things work.

I do have limited insight into the whole container ecosystem and existing setups, but from what I have seen, the “nicely containerized setup” often does not consist of carefully engineered bare-minimum containers with only the necessary libraries and carefully watched dependencies, but of a bunch of Ubuntus in varied versions where someone did just enough to get some specific piece of software running on top.
At this point you have replaced your already complex system of different dependencies and a lot of interactions with one where the number went up by an order of magnitude. Instead of patching the one vulnerable instance of a library, there are now probably about six around.
And since a lot of these containers are just downloaded from dockerhub, you would have to examine every single one of them, upgrade them, patch them, and rebuild them.
Of course there are also changes in the runtime itself from time to time which have to be taken care of.

But updating is not the only problem; I once had a broken Icinga setup where the problem was a full volume for the Icinga container.
Full filesystems are still one of the most common causes of system failures, and instead of having one or two of those, where most admins know how to detect that, you now have five or ten, and most admins have no idea how to handle that and all the namespace magic.
The usual check_disk service on the container hypervisor did not see or watch the container volumes properly, and nobody had thought about it.
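
One way to reduce that blind spot, sketched here under the assumption of the usual Debian plugin path and docker’s default data directory: point check_disk explicitly at the paths the containers actually write to.

    # Watch docker's data directory (default /var/lib/docker) in addition to "/".
    # Adjust the paths if your daemon uses a different data-root or keeps named
    # volumes on separate filesystems.
    /usr/lib/nagios/plugins/check_disk -w 15% -c 5% -p / -p /var/lib/docker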

My impression is that deployment with docker is fairly easy, which makes it very attractive, but maintenance and debugging are harder due to the increased complexity.

Regarding kubernetes (and derivatives and similar systems): they are great and cool and fix a real problem, where you have to coordinate a lot of components and resources and have to be very flexible and responsive (automatic scaling of web servers based on demand and such things).
But that’s a very specific problem for specific systems. The common saying here is “You don’t have Google problems”, and that is still true for most setups, which are rather static and should just work in the same way most of the time.

From the lines above one might get the impression that I resent container technologies and powerful resource management systems (kubernetes) in general, but this is not the case.
Containers are great for certain things, for example software testing and packaging, since you start every run with a “clean” system which is always the same (ideally). Also, for practical reasons, having a few containers running the odd old production-critical service which needs an Ubuntu 14.04 or something like that is probably the most manageable way to do it.
And kubernetes is probably a good way to construct complex systems from a central control point and a cool system to think about and control resource management.
But don’t just jump at it because it’s “the way things are done now” or because everybody seems to talk about it. Good infrastructure is boring, and there is not much to talk about because it works.

Regarding Icinga specifically, containerized setups are tricky: icinga2 heavily relies on the Monitoring Plugins to monitor things, so they must be available in the container somehow (a sketch follows below). Scaling, replication and so on are done by icinga2 itself internally; trying to do that with kubernetes just gets you two conflicting philosophies and a non-functional system.
Separating the web interface from the core makes the most sense, until you want to use some of the modules as Monitoring Plugins (“director health checks”, “businessprocess”, “x509”), since those interfaces are not available via the HTTP API but only via local execution.
The Icinga stack as a whole does not fit nicely into the way docker/kubernetes/etc. work; it is a system designed more than a decade ago, with the interfaces and strategies of that time in mind.
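
To illustrate the plugin point, one approach would be to extend an Icinga 2 image with the distribution’s monitoring-plugins package. Everything in this sketch is an assumption: the base image name, that it is Debian-based, and that the build steps may run as root; check the documentation of the image you actually use before copying this.

    # Write a minimal Dockerfile that adds the Monitoring Plugins on top of an
    # Icinga 2 base image (base image name and package manager are assumptions).
    printf '%s\n' \
        'FROM icinga/icinga2' \
        'USER root' \
        'RUN apt-get update && apt-get install -y --no-install-recommends monitoring-plugins' \
        > Dockerfile.plugins

    # Build the extended image; switch back to the image's unprivileged user in
    # the Dockerfile if the base image defines one.
    docker build -t icinga2-with-plugins -f Dockerfile.plugins .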

Sorry for the long post, I hope it sheds some light on my opinions here. Feel free to ask if something is unclear (it made sense in my head 🙂).
