Scheduling downtime via API getting overwritten

I’m trying to schedule downtime for a host (all services) via the icinga2 API. We’re using the icingadb module, which I understand is mutually exclusive with the older monitoring module, so its /icingaweb2/monitoring/host/schedule-downtime interface for scheduling downtime through icingaweb2 isn’t available to me.

When I schedule the downtime directly on one of our icinga2 nodes, it shows up briefly in /icingaweb2/icingadb/downtimes and then goes away; something seems to be overwriting it. My suspicion is that icingaweb2 (or the director module) is somehow doing this because I didn’t schedule the downtime through it, but directly on the icinga node. As far as I can see, though, there is no way to schedule downtime through icingaweb2 or director via API when using the icingadb module, so the only API option is the icinga2 API on the node itself. We have two nodes, icinga3 and icinga4; icinga3 is the “main” one where changes are made, which are then synced over to icinga4. Does anyone know if there’s a way to schedule downtime via API in this setup, or a way to get icingaweb2 to pick up on downtime scheduled directly on a node like this? Thanks!
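For reference, we’re using the schedule-downtime action of the icinga2 API. The call looks roughly like this; the host name, credentials, and timestamps are placeholders:

curl -k -s -S -u apiuser:password \
  -H 'Accept: application/json' \
  -X POST 'https://icinga3.our.domain:5665/v1/actions/schedule-downtime' \
  -d '{
    "type": "Host",
    "filter": "host.name==\"somehost\"",
    "all_services": true,
    "author": "api",
    "comment": "planned maintenance",
    "start_time": 1713304800,
    "end_time": 1713312000,
    "pretty": true
  }'

The call succeeds and returns the names of the created Downtime objects, and the downtime does show up in icingadb at first.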

Icinga and OS Version info from "icinga2 --version":
icinga2 - The Icinga 2 network monitoring daemon (version: r2.13.6-1)

Copyright (c) 2012-2024 Icinga GmbH (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later https://gnu.org/licenses/gpl2.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

System information:
Platform: Debian GNU/Linux
Platform version: 12 (bookworm)
Kernel: Linux
Kernel version: 6.1.0-10-amd64
Architecture: x86_64

Feature information from "icinga2 feature list":
Disabled features: command compatlog debuglog elasticsearch gelf graphite influxdb influxdb2 livestatus opentsdb perfdata statusdata syslog
Enabled features: api checker icingadb mainlog notification

Icinga Web 2 version and modules (System - About)
Icinga Web 2 Version 2.11.4
PHP Version 8.2.18

Config validation from "icinga2 daemon -C":
[2024-04-16 16:00:56 -0400] information/cli: Icinga application loader (version: r2.13.6-1)
[2024-04-16 16:00:56 -0400] information/cli: Loading configuration file(s).
[2024-04-16 16:00:56 -0400] information/ConfigItem: Committing config item(s).
[2024-04-16 16:00:56 -0400] information/ApiListener: My API identity: icinga3.our.domain
[2024-04-16 16:00:57 -0400] warning/ApplyRule: Apply rule 'Disk space' (in /var/lib/icinga2/api/packages/director/2340b00a-b2c6-499e-9c6d-0e8b076f7ea3/zones.d/director-global/servicesets.conf: 75:1-75:26) for type 'Service' does not match anywhere!
[2024-04-16 16:00:57 -0400] warning/ApplyRule: Apply rule 'Load' (in /var/lib/icinga2/api/packages/director/2340b00a-b2c6-499e-9c6d-0e8b076f7ea3/zones.d/director-global/servicesets.conf: 85:1-85:20) for type 'Service' does not match anywhere!
[2024-04-16 16:00:57 -0400] warning/ApplyRule: Apply rule 'Memory' (in /var/lib/icinga2/api/packages/director/2340b00a-b2c6-499e-9c6d-0e8b076f7ea3/zones.d/director-global/servicesets.conf: 95:1-95:22) for type 'Service' does not match anywhere!
[2024-04-16 16:00:57 -0400] warning/ApplyRule: Apply rule 'APT' (in /var/lib/icinga2/api/packages/director/2340b00a-b2c6-499e-9c6d-0e8b076f7ea3/zones.d/director-global/servicesets.conf: 105:1-105:19) for type 'Service' does not match anywhere!
[2024-04-16 16:00:57 -0400] information/ConfigItem: Instantiated 1 NotificationComponent.
[2024-04-16 16:00:57 -0400] information/ConfigItem: Instantiated 1 CheckerComponent.
[2024-04-16 16:00:57 -0400] information/ConfigItem: Instantiated 3 Users.
[2024-04-16 16:00:57 -0400] information/ConfigItem: Instantiated 1697 Services.
[2024-04-16 16:00:57 -0400] information/ConfigItem: Instantiated 3 Zones.
[2024-04-16 16:00:57 -0400] information/ConfigItem: Instantiated 3 NotificationCommands.
[2024-04-16 16:00:57 -0400] information/ConfigItem: Instantiated 583 Notifications.
[2024-04-16 16:00:57 -0400] information/ConfigItem: Instantiated 5529 Hosts.
[2024-04-16 16:00:57 -0400] information/ConfigItem: Instantiated 1 IcingaApplication.
[2024-04-16 16:00:57 -0400] information/ConfigItem: Instantiated 7 HostGroups.
[2024-04-16 16:00:57 -0400] information/ConfigItem: Instantiated 2 Endpoints.
[2024-04-16 16:00:57 -0400] information/ConfigItem: Instantiated 1 FileLogger.
[2024-04-16 16:00:57 -0400] information/ConfigItem: Instantiated 3 ApiUsers.
[2024-04-16 16:00:57 -0400] information/ConfigItem: Instantiated 233 CheckCommands.
[2024-04-16 16:00:57 -0400] information/ConfigItem: Instantiated 1 ApiListener.
[2024-04-16 16:00:57 -0400] information/ConfigItem: Instantiated 1 IcingaDB.
[2024-04-16 16:00:57 -0400] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2024-04-16 16:00:57 -0400] information/cli: Finished validating the configuration file(s).

zones.conf file:
/*
 * Generated by Icinga 2 node setup commands
 * on 2023-08-08 16:07:29 -0400
 */

object Endpoint "icinga3.our.domain" {
}

object Endpoint "icinga4.our.domain" {
  host = "icinga4 IP address here"
}

object Zone "master" {
  endpoints = [ "icinga3.our.domain", "icinga4.our.domain" ]
}

object Zone "global-templates" {
  global = true
}

object Zone "director-global" {
  global = true
}

Following up on my own post to share what the problem turned out to be. After our API call to schedule downtime directly on an icinga node, we were making a subsequent API call to director’s director/config/deploy endpoint to deploy a changeset. This made sense for all the other API operations we were using that did go through director’s API, but not in this case, and it is why our changes were being overwritten on the icinga nodes.

We eliminated the extra call to director’s deploy endpoint (just when scheduling downtime; it’s still there for everything else) and the downtime no longer disappears right after it’s scheduled.
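For anyone hitting the same thing, the extra call we removed from the downtime path was roughly the following; the icingaweb2 host and credentials are placeholders:

curl -s -S -u apiuser:password \
  -H 'Accept: application/json' \
  -X POST 'https://icingaweb.our.domain/icingaweb2/director/config/deploy'

Presumably the deploy re-renders the config from director’s database, which knows nothing about objects created directly on the node, and the resulting reload was dropping our downtime.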

HOWEVER - the downtime we schedule directly on a node this way still gets overwritten as soon as we do a deploy from director, whether via the API or in the icingaweb interface. So if we schedule downtime for some host on an icinga node via API, and then update a comment or tag or anything on any other host or service (not just the one we scheduled downtime on), the downtime is wiped out again. So the problem isn’t really solved, just identified: it remains an open question how to schedule downtime via API when using icingadb and director.
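To confirm it’s the deploy doing this, we query the downtime objects on the node right after a deploy; a minimal check, with placeholder credentials and host:

curl -k -s -u apiuser:password \
  -H 'Accept: application/json' \
  'https://icinga3.our.domain:5665/v1/objects/downtimes?pretty=1' | grep '"__name"'

Before the deploy this lists our downtime; afterwards the results are empty.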

We also see most or all objects whose source is the Icinga 2 API getting deleted at config reload.

Is it every time, or only with specific changes in director?

If you change the hostname of an object, the downtimes for that object and its services are no longer present; they disappear because the object is no longer the same…

It looks like it’s happening every time :confused:

I’ve already lost API users who now use something else to provision their monitoring after Ansible runs.

I had a similar (weird) problem; maybe it’s related to yours. I had a small configuration mistake with my nodes that didn’t show up with IDO but did with icingadb.

Try this: on both root nodes (icinga3 and icinga4), run the following commands:

  • icinga2 daemon -C --dump-objects #(updates the object cache for the next command)
  • icinga2 object list --type Downtime | grep ^Object | less

Then check whether the (missing) objects are really missing on both nodes or just on one. If the objects are missing on both nodes, I can’t help you :wink:
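If you want to compare the two lists directly, something like this works, assuming you can copy a file between the nodes (paths are just examples):

# on each node: refresh the object cache, then dump the downtime names
icinga2 daemon -C --dump-objects
icinga2 object list --type Downtime | grep '^Object' | sort > /tmp/downtimes-$(hostname -s).txt

# copy one file to the other node, then:
diff /tmp/downtimes-icinga3.txt /tmp/downtimes-icinga4.txt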