Master node as multi-site aggregator

Hi,

I’m investigating the functionality of Icinga to see if it would be a good replacement for our current setup.
The overall architecture I’d be looking at would be a single master (possibly in HA), and a set of satellites - nothing special here. Due to how we’re doing configuration management though, we can’t approach the setup in a top-down fashion; it would instead be very idiomatic for us to store checks on the satellites as plain text files, and use the master purely as an aggregator. In other words, I would want something more like a multi-master + aggregator architecture instead of master and satellites.

I have been reading several threads and docs, and so far I could determine the following:

…and I thus have a few questions:

  • is a master able to not just display checks that are stored on satellites, but also communicate back to them (e.g. to issue downtimes)?
  • if not, could I use a centralized icingaweb2 to connect to multiple IDO DBs?
  • is a multi-master, multi-IDO setup something I should not rely on, because it might be deprecated at some point?

Thanks!

Hello and Welcome.

In your proposed scenario the master is the “aggregator”, so it is the place where you can see the status and control the sending of notifications. A downtime defined on the master will therefore cover that.

I am unsure why you’d want an independent IDO for each satellite. If you want to aggregate the data, a single database is surely the way to go, and you can use the database’s HA functionality for redundancy.

Maybe some interesting stuff to read for you:

As @aflatto said, the master will be able to distribute downtimes etc.
That is basically what the master does: it receives check results from the satellites, distributes the config, dishes out commands/downtimes/comments to hosts/services, and sends notifications.
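To make the downtime part a bit more concrete, here is a minimal sketch of how a downtime could be scheduled against the master through the Icinga 2 REST API (`/v1/actions/schedule-downtime`). The base URL, host name, author and comment are made-up placeholders, and the helper `build_downtime_request` is a hypothetical name, not something Icinga ships. The request is only built here, not sent; authentication and TLS settings depend on your setup and are left out.

```python
import json
import time
import urllib.request


def build_downtime_request(base_url, host_name, duration_s, author, comment, now=None):
    """Build (but do not send) a schedule-downtime request for a single host.

    Hypothetical helper for illustration; all argument values are assumptions.
    """
    start = int(now if now is not None else time.time())
    payload = {
        "type": "Host",
        # The host name is passed via filter_vars rather than being
        # interpolated into the filter expression.
        "filter": "host.name == host_name",
        "filter_vars": {"host_name": host_name},
        "start_time": start,
        "end_time": start + duration_s,
        "author": author,
        "comment": comment,
        "fixed": True,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/actions/schedule-downtime",
        data=json.dumps(payload).encode("utf-8"),
        # The Icinga 2 API expects an Accept header asking for JSON.
        headers={"Accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values: one hour of downtime for a host called "web01".
request = build_downtime_request(
    "https://master.example.org:5665", "web01", 3600, "ops", "planned maintenance"
)
print(request.get_method(), request.full_url)
```

Sending it would additionally need API user credentials (for example HTTP basic auth) and a TLS context that trusts the master’s certificate, so treat this as a starting point for lab tests rather than a finished script.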

Maybe the schema drawings I made some time ago also help you :slight_smile:

The various zones are in separate geographical regions and belong to separate customers. From a configuration management perspective, they are in separate, non-communicating domains. Additionally, all the zones already have properly sized and redundant databases, whereas the master would live in a much more resource-constrained environment and should thus try to be rather lightweight. Finally, connectivity from the master to the satellites can at times fail, so the separate zones must be able to operate in full autonomy.

Effectively, zones should be separate icinga deployments, but for operational convenience we also need a centralized dashboard (hence my mention of thruk).

Yeah, I did read that thread and it’s good to hear that a single database can handle that many checks, but the nature of our deployments does require a more distributed approach.

Thanks for the replies! I will run some tests in the lab to see how icinga reacts to having separate IDO DBs.

Ok, I think I’ve hit a roadblock: the master node must have an IDO, which is something I would want to avoid. If that’s correct, livestatus + thruk would remain the only way for me to aggregate separate sites, but the fact that livestatus might be deprecated soon gives me a lot of pause :confused: