Event streams between Icinga masters

We have 2 sites and want monitoring at each site to continue if either site suffers a total external connectivity failure.

We would also like independent Icingaweb2 instances that both show all hosts and services, the assumption being that when one site goes down, the other site's hosts and services would show as failed or not run.

The only solution I can see for this when using Icinga2 and IcingaDB is to have 2 totally separate Icinga2 clusters. As I understand it, IcingaDB needs to share a database in an Icinga2 HA master scenario, and DB clusters need an odd number of hosts and a quorum, and can't manage split brain even if they could be forced to run with more than 50% of hosts missing from the cluster.

My questions are:
Am I correct in assuming that IcingaDB can't have separate DBs, or, if it can, that it can't heal from split brain?

Is there a way to configure the event stream on each independent Icinga2 master to passively update the other master's hosts and services (if they are configured there) by pointing it directly at the other site's Icinga2 master?

I know I can subscribe to the event stream from each Icinga2 master, process the data received myself (flask app, etc…) and then make my own API calls to the other Icinga2 master to push passive results. I'm wondering if I can just point the firehose at the other host and have it be smart enough to ingest the data itself?
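For reference, the do-it-yourself relay described above could look roughly like the sketch below. It is only a sketch under assumptions: it uses the Icinga2 REST API endpoints /v1/events and /v1/actions/process-check-result, while the queue name, the RELAY_SOURCE marker, the credentials and the URLs passed to relay() are all made up for illustration. It also assumes the check_source given to process-check-result shows up again in the resulting CheckResult event, which is what would stop two masters from bouncing the same result back and forth. I have not verified that against a live setup.

```python
import base64
import json
import urllib.error
import urllib.request

# Hypothetical marker used to recognise results that were relayed already.
RELAY_SOURCE = "site-relay"


def event_to_passive_result(event):
    """Translate one CheckResult event into a process-check-result body.

    Returns None for events that should not be relayed.
    """
    if event.get("type") != "CheckResult":
        return None
    result = event["check_result"]
    # Assumption: relayed results carry our marker, so skip them (loop guard).
    if result.get("check_source") == RELAY_SOURCE:
        return None

    exit_status = int(result["exit_status"])
    body = {
        "plugin_output": result.get("output", ""),
        "performance_data": result.get("performance_data") or [],
        "check_source": RELAY_SOURCE,
        "filter_vars": {"hn": event["host"]},
    }
    service = event.get("service")
    if service:
        body["type"] = "Service"
        body["filter"] = "host.name == hn && service.name == sn"
        body["filter_vars"]["sn"] = service
        body["exit_status"] = exit_status
    else:
        body["type"] = "Host"
        body["filter"] = "host.name == hn"
        # Assumption: plugin exit codes 0/1 mean UP for hosts, while the
        # action expects 0 (UP) or 1 (DOWN).
        body["exit_status"] = 0 if exit_status in (0, 1) else 1
    return body


def _request(url, user, password, payload):
    """Build an authenticated JSON POST request."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        method="POST",
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def relay(source, target, user, password, context=None):
    """Read CheckResult events from `source` and push them to `target`."""
    events = _request(f"{source}/v1/events", user, password,
                      {"types": ["CheckResult"], "queue": "site-relay"})
    with urllib.request.urlopen(events, context=context) as stream:
        for line in stream:  # the stream delivers one JSON document per line
            if not line.strip():
                continue
            body = event_to_passive_result(json.loads(line))
            if body is None:
                continue
            push = _request(f"{target}/v1/actions/process-check-result",
                            user, password, body)
            try:
                urllib.request.urlopen(push, context=context).close()
            except urllib.error.HTTPError:
                # e.g. the object is not configured on the other master
                continue
```

You would run one of these per direction, e.g. relay("https://master-site1:5665", "https://master-site2:5665", "relay-user", "secret", context=...) with an ssl.SSLContext that trusts the Icinga CA. The target objects need to accept passive check results, and the API user needs the events/CheckResult and actions/process-check-result permissions.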

Could you split the DB, the masters and the icingaweb2 hosts over both sites?

I guess that satellites and zones are required to cleanly separate the checking between the sites.

The HA logic should take care of managing the connection failure, but it will be a split-brain scenario, as a quorum isn't possible with 2 nodes.
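To spell out the quorum arithmetic, here is a tiny sketch assuming plain, unweighted majority voting (real clusters such as Galera add node weights and arbitrators, so treat this as an illustration only). The function name has_quorum is made up for the example.

```python
def has_quorum(reachable, total):
    """True if `reachable` nodes form a strict majority of `total` nodes."""
    return 2 * reachable > total


# 2 nodes split 1/1: neither side has a majority, so both must stop or
# both carry on independently (split brain).
two_node_split = [has_quorum(1, 2), has_quorum(1, 2)]

# 3 nodes split 2/1: exactly one side keeps quorum.
three_node_split = [has_quorum(2, 3), has_quorum(1, 3)]
```

With 2 nodes, each half of a split sees 1 of 2, which is not more than half. With 3 nodes, the side that still sees 2 of 3 keeps going and the isolated node stops.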

Would a setup like this be a viable option for you?

Central Single Master with IcingaDB & webinterface
---> satellite in site1 with a local icingadb and webinterface
|
---> satellite in site2 with a local icingadb and webinterface

This way you use the master instance to have the overall view of your monitoring and both satellites with the view for only their own zone.

The satellites will cache their results (default 1d) if they can't send them to the master.

This is what we currently do. Unfortunately, when we tested failing over to the DR site, it lost all connectivity and the Galera quorum was on the primary site, so nothing on the DR site had any monitoring.

I'm not sure what you mean here when you say:

Central Single Master with IcingaDB

plus

satellite in site1 with a local icingadb
satellite in site2 with a local icingadb

Are there 3 independent IcingaDB instances in this scenario, with the satellites feeding their data into their own IcingaDB?
I didn't think this was possible, but if it is I would love to know how it works.

Yes, each satellite can have its own icingadb instance (and webinterface).
This way you can have separate webinterfaces for your satellite zones, each showing only its own zone.

The satellites will still forward their data to the master, where you then have the complete overview of your infrastructure.

Thanks @log1c, I'll give this a go and see if it gives us the right redundancy profile.