Live Master + Test Master?

Hi.

I have a setup with an Icinga Master monitoring several groups of Slaves (no Satellites involved).

The Slaves are grouped into Dev(elopment), Test and Live.

When my Dev Slaves got upgraded from Icinga 2.10 to 2.11, they stopped running some (not all) checks correctly, so I downgraded them again and held the Master and the Test and Live Slaves back from upgrading.

I think the correct way to perform this upgrade would be to do the Master machine first, but I’m nervous about doing this simply because it is the Live Icinga server. So I’m wondering: is there a way I can have a Test Icinga Master server operating alongside the Live Master, both talking to the same back-end Slaves and getting all their monitoring updates? Then I could upgrade the Test Master and check that it all works okay, and only upgrade the Live Master (which other people are watching, not just me) once it looks stable.

Basically, I want to do a controlled roll-out of Icinga 2.11, but in a way which doesn’t break my (currently working) 2.10 Master, and doesn’t lose any monitoring history if there’s a problem.

What have other people done in this situation for a production environment where downtime or breakage of the monitoring system itself is really unacceptable?

Ideas and/or case studies welcome :slight_smile:

Thanks,

Antony.

If your “Slaves” are configured as Icinga core (aka Icinga agent), then it’s not possible, because Icinga core only allows one parent.

A very common, and even recommended, setup is to run Icinga in a VM. With that you could simply clone the VM, run the upgrade on the clone and check whether everything is working.
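To make the "check if everything is working" step a bit less manual, here is a rough sketch of how the two instances could be compared. It assumes you have some way to export the current service states from the live master and from the upgraded clone as simple name-to-state mappings; how you collect them is up to you and is not shown here. The function name `compare_states` and the `host!service` key format are just illustrative choices, not anything Icinga ships.

```python
from typing import Dict, List

# Standard service state codes: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN
STATE_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def compare_states(live: Dict[str, int], clone: Dict[str, int]) -> List[str]:
    """Return one line per service that differs between the two snapshots.

    Both arguments map a hypothetical "host!service" key to a numeric state.
    """
    diffs = []
    for name in sorted(set(live) | set(clone)):
        if name not in clone:
            diffs.append(f"{name}: missing on clone")
        elif name not in live:
            diffs.append(f"{name}: only on clone")
        elif live[name] != clone[name]:
            live_state = STATE_NAMES.get(live[name], str(live[name]))
            clone_state = STATE_NAMES.get(clone[name], str(clone[name]))
            diffs.append(f"{name}: live={live_state} clone={clone_state}")
    return diffs


if __name__ == "__main__":
    # Made-up example data, only to show the output format.
    live = {"web1!disk": 0, "web1!load": 0, "db1!mysql": 2}
    clone = {"web1!disk": 0, "web1!load": 3, "app1!http": 0}
    for line in compare_states(live, clone):
        print(line)
```

An empty result would suggest the clone sees the same picture as the live master. Any lines it prints are the checks worth looking at before touching the live system.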

I prefer another way: I have development, test and production environments. I always start testing in the development environment, then move to the test environment. Only if everything looks OK do I move on to the production sites. All installations have backup solutions in place, but for major changes, e.g. 2.10.x to 2.11, I also take temporary snapshots.