High Availability Setup for Icinga Web with Multiple Masters and Shared Databases

Itay-Aszodi · April 24, 2024, 2:10pm

Hello Icinga Community,

I am exploring the possibility of setting up a high availability (HA) environment for Icinga, specifically focusing on Icinga Web. I would appreciate your insights or any guidance on this.

Current Setup:

Two Icinga master nodes: master-a and master-b.
Three external databases shared by both masters: Director, Icinga Web, and IcingaDB.

Understanding: As far as I can tell, Icinga’s HA capabilities cover IcingaDB, but there is no clearly documented support for running Icinga Web in an HA configuration.

Question: If I configure master-b to run Icinga Web, pointing it to the same three databases as master-a, can I effectively achieve a de facto HA setup for Icinga Web by using a load balancer to manage traffic between the two Icinga Web instances on each master?

Would this setup ensure that both the web interface and backend remain highly available and consistent across both nodes, or are there potential issues with session handling, configuration synchronization, or other factors that I should consider?

Thank you in advance for your help and advice!

rivad · April 24, 2024, 2:14pm

We run Icingaweb2 behind a load balancer, but it is configured to always direct users to the primary master and to fail over to the secondary master only when the primary is unavailable.
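The routing decision described above can be sketched in a few lines. This is only an illustration of active/passive selection, not the configuration of any particular load balancer; the `Backend` type and `pick_backend` function are hypothetical names, and the health flags stand in for whatever health check the balancer actually performs.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Backend:
    name: str             # host name, e.g. "master-a"
    healthy: bool         # outcome of the most recent health check (assumed)
    backup: bool = False  # True for the node that is used only on failover


def pick_backend(backends: Sequence[Backend]) -> Optional[Backend]:
    """Prefer a healthy primary; use a healthy backup only if no primary is up."""
    for candidate in backends:
        if candidate.healthy and not candidate.backup:
            return candidate
    for candidate in backends:
        if candidate.healthy and candidate.backup:
            return candidate
    return None  # nothing is reachable


if __name__ == "__main__":
    nodes = [
        Backend("master-a", healthy=False),
        Backend("master-b", healthy=True, backup=True),
    ]
    chosen = pick_backend(nodes)
    print(chosen.name if chosen else "no backend available")
```

With both nodes healthy, all traffic goes to master-a; master-b is selected only while master-a fails its check, which matches the "failover only" behaviour rather than balancing sessions across both nodes.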

The icingaweb2 modules aren’t HA-aware, so don’t run their accompanying systemd services on both masters simultaneously.

Also, some icingaweb2 settings aren’t stored in the database. We solved this by keeping /etc/icingaweb2 in a git repository: incron triggers an automatic commit, and a commit hook connects via ssh to the secondary master to pull the changes.

rivad · April 24, 2024, 2:58pm

Also, the .git directory isn’t under /etc/icingaweb2, as that would result in an endless loop of commits: each commit would itself trigger incron. This isn’t a big problem, as git allows the git directory to be kept separate from the working directory (git init --separate-git-dir).
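For illustration, here is a rough sketch of what such an auto-commit helper could look like if incron were to call a small Python script. It is not the actual script from this setup: the repository path /var/lib/icingaweb2-config.git, the function names, and the commit message are all assumptions. Only `git init --separate-git-dir` and the /etc/icingaweb2 working directory come from the posts above.

```python
import subprocess
from typing import List

WORK_TREE = "/etc/icingaweb2"                 # working directory watched by incron
GIT_DIR = "/var/lib/icingaweb2-config.git"    # hypothetical location outside the watched tree


def init_command(git_dir: str, work_tree: str) -> List[str]:
    """One-time setup: keep the repository data outside the working directory."""
    return ["git", "init", f"--separate-git-dir={git_dir}", work_tree]


def git_base(git_dir: str, work_tree: str) -> List[str]:
    """Common prefix that points git at the separated repository explicitly."""
    return ["git", f"--git-dir={git_dir}", f"--work-tree={work_tree}"]


def auto_commit(git_dir: str, work_tree: str, message: str = "automatic commit") -> bool:
    """Stage everything and commit if anything changed. Returns True on commit."""
    base = git_base(git_dir, work_tree)
    subprocess.run(base + ["add", "-A"], cwd=work_tree, check=True)
    # `git diff --cached --quiet` exits with 0 when nothing is staged.
    staged = subprocess.run(base + ["diff", "--cached", "--quiet"], cwd=work_tree)
    if staged.returncode == 0:
        return False
    subprocess.run(base + ["commit", "-m", message], cwd=work_tree, check=True)
    return True


if __name__ == "__main__":
    auto_commit(GIT_DIR, WORK_TREE)
```

Because the commit writes only under the separate git directory, it does not modify anything inside /etc/icingaweb2 and so should not re-trigger the watcher. Pulling the changes on the secondary master via ssh would live in a commit hook and is not shown here.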

Itay-Aszodi · April 24, 2024, 3:08pm

Thank you for sharing your approach. It’s very informative and helps.

Regarding your setup, could you elaborate on how the systemd services are configured to handle failover? Specifically, how is it decided automatically whether to switch a service on or off when the primary server becomes unreachable or fails?
(For example, is the graph module not running on both nodes?)

Additionally, if I understood correctly, /etc/icingaweb2/ configurations across both masters should be identical.

Again, thanks

Itay-Aszodi · April 24, 2024, 3:10pm

My draft of the Icinga architecture I have in mind:

rivad · April 24, 2024, 3:20pm

I simply disabled the systemd services on the secondary master.
I also didn’t install the Director at all on the secondary master, so that I will notice when a failover has happened.

Grafana and the icingaweb2 grafana module are installed on both masters, but the InfluxDB is external. This also simplifies HTTPS security options.

Yes, /etc/icingaweb2/ is kept in sync, so users have their dashboards on both masters, and the icingaweb2 business process module configs are available on both as well. All changes are recorded in the git history, sadly without context. It would be nice if icingaweb2 triggered the commit itself, with the user and action in the commit message.

Itay-Aszodi · April 24, 2024, 7:22pm

Thanks for the details on your setup. The use of Git to sync /etc/icingaweb2/ across masters is clever.