I’m running a distributed monitoring setup with the following modules:
Background:
Until recently, we had hosts that were flagged for after-hours notifications and hosts that were not. My organization now wants to pivot and enable after-hours notifications for individual services in addition to hosts. Almost all services are applied to hosts via service sets. For example, I have a “Linux” service set that applies to objects in a “Linux” hostgroup. These amount to ~1000 services.
The Problem:
My existing configuration does not have a custom variable (field) for this on any service template, so the apply-to rules of our notifications have nothing to match on at the service level. We also use service sets wherever we can to automatically add baseline services to hosts when they are added to our inventory. Hosts already have this variable because of how we currently send after-hours notifications. In Nagios, we defined timeperiods directly in the service definition. Unless I’ve misunderstood something, Icinga 2 does the opposite and wants notification configuration to be assigned exclusively through apply-to rules.
I was wondering what the shortest path forward is to get from where I am:
From: notifying for hosts/services based on a host variable, meaning we notify for all services on an “after-hours” host.
To: a configuration where I can easily enable or disable after-hours notifications on a service-by-service basis.
Potential Option One:
My theory, before being able to do any testing, is that I can add the field to a root service template like Generic-Service. This template holds the boilerplate settings for our normal checks and is imported into each service template. I could then override the value as needed on a per-service basis (most services do not require after-hours notifications). However, I believe that doing this would mean individually overriding every single service on every single host we already have deployed (read: a hellacious amount of work). I’m hoping I can use something like:
icingacli director service set * --vars.afterhours_notifications 'true'
If not, I’m hoping there is another way to use the Director/Icinga APIs to mass-change this service variable on the services applied through service sets. The biggest hitch with this option is going back and updating existing services: it requires a retroactive change on ~1000 services.
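If the wildcard form is not accepted, a per-service loop might be a fallback. The sketch below is a dry run only: it prints one command per service name and executes nothing. The command shape is copied from the line above and is not verified against the Director CLI, and I don't know whether it reaches services that come from a service set. `build_commands` and `services.txt` are hypothetical names for illustration.

```shell
#!/bin/sh
# Dry run: print one icingacli command per service name; nothing is executed.
# ASSUMPTION: the command shape mirrors the one in this post and is unverified.
# ASSUMPTION: names contain no single quotes (they are wrapped in single quotes).
build_commands() {
  # Read service names from stdin, one per line, skipping blank lines.
  while IFS= read -r name; do
    [ -n "$name" ] || continue
    printf "icingacli director service set '%s' --vars.afterhours_notifications 'true'\n" "$name"
  done
}

# Example (hypothetical file with one service name per line):
#   build_commands < services.txt
```

Reviewing the printed commands before running any of them would keep the retroactive change auditable, and the list could be trimmed to only the services that actually need after-hours notifications.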
Potential Option Two:
I duplicate the Generic-Service template, naming the copy “Generic-Service (After-Hours)”. I then fork the service templates so that one copy imports the after-hours template and the other imports the standard business-hours one. I really hate this option, as it will surely break existing overrides because the services would apply differently or under new names.
If there are better paths forward, I’m all ears. I’m trying to avoid sinking several hours into clicking around a web interface changing a boolean from false to true on ~1000 services.
Thank you for your time, I understand this is a bit of an essay. I’m hoping I’m just overcomplicating my problem.