Icinga2's Graphite or other object writers - Architecture of HA

fatslimjoe · August 3, 2023, 8:38am

Hello team,

I have a question regarding the concept we want to implement. Before I proceed to test it in the test environment, I was hoping someone with experience could share some insights.

Currently, we have a setup with 2 masters in a high availability (HA) configuration. Both masters are using graphite object writers, and there is only one instance of the graphite server where the object writers from both master environments store performance data.

Now, my question is: If we were to use two instances or two graphite servers for the object writers, would this create a “split brain” scenario? For instance, Master1’s object writer pointing to graphite-srv1, and Master2’s object writer pointing to graphite-srv2.

I noticed that icinga_ido is responsible for ensuring that both masters store data in both databases (master1 has db1, and master2 has db2), which means the data should be persistent even if one master goes down.

Alternatively, do we need to consider graphite clustering, where we only need to point icinga2’s object writer configuration to the IP address/FQDN of the graphite cluster, if we plan to use more graphite nodes instead of just one?

Thank you all for brainstorming and sharing your insights.

fatslimjoe · August 8, 2023, 2:00pm

Hi,

I have an update: I found the following in the documentation (Object Types - Icinga 2).

So by default it should be “enable_ha = false”? If it is not specified in the object writer, does this mean both masters can actively store perf data in Graphite's database, so it will not be a “split brain” scenario?

THX

Al2Klimov · August 29, 2023, 3:45pm

Hello Joe!

enable_ha means basically:

As long as connected, only one Icinga node writes perfdata to whatever Graphite is configured locally
As long as disconnected, both Icinga nodes (are in split-brain and) write perfdata to whatever Graphite is configured locally
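
For illustration, a minimal sketch of a writer with HA turned on, as it might appear in the local graphite feature file on one node. The host name graphite-srv1 is taken from the question above; the object name and port are assumed defaults, so adjust them to your setup.

```
// Sketch only: local GraphiteWriter on one node of the HA zone.
// "graphite-srv1" is the example host from the question; 2003 is the
// usual Graphite plaintext port and is an assumption here.
object GraphiteWriter "graphite" {
  host = "graphite-srv1"
  port = 2003

  // With enable_ha = true, only one node of the zone writes perfdata
  // while both nodes are connected to each other.
  enable_ha = true
}
```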

Best,
A/K

fatslimjoe · August 30, 2023, 8:43am

Thank you for the feedback.

So, if we want the object writers on both HA satellites in Zone 1 to write at the same time, each satellite to its own Graphite server, we need to set the option “enable_ha” to “false”, right?

When one satellite goes offline, there might be missing data for that period. Is there some kind of synchronization that occurs when it comes online again, allowing the offline satellite to receive the missing data from the working one?

Thanks. Sorry if these questions seem basic, but the concept of HA in icinga2 is still a bit unclear to me.

KR,
J.

Al2Klimov · August 30, 2023, 8:52am

Yes, enable_ha=false.
It should send the missing data after being offline, but I’m not sure whether Graphite will still accept the delayed data points, given its time slots concept.
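
A rough sketch of what that could look like, assuming each node keeps its own local feature file and the host names graphite-srv1 and graphite-srv2 from the original question (the object name and port are assumptions as well):

```
// Node 1, local feature file (sketch, not a tested configuration)
object GraphiteWriter "graphite" {
  host = "graphite-srv1"   // example host from the question
  port = 2003              // assumed Graphite plaintext port
  enable_ha = false        // this node always writes, connected or not
}

// Node 2, local feature file (sketch, not a tested configuration)
object GraphiteWriter "graphite" {
  host = "graphite-srv2"   // example host from the question
  port = 2003
  enable_ha = false
}
```

The two objects live in separate files on separate nodes, so reusing the same object name is not a conflict in this sketch.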

fatslimjoe · August 30, 2023, 9:32am

Aham OK,

many thx for feedback. I think I have understood.

Thx a lot.
Cheers!