Notifications sent during scheduled downtime

brianmc · August 28, 2020, 8:30am

Hi all! I would appreciate any pointers on the issue below.

I have two hosts configured identically, with downtime scheduled every morning from 04:00 - 04:45 to allow backups to take place.

For one host, this works fine. For the other, notifications are sent even though the host is in scheduled downtime. What conditions might cause this?

brianmc · August 28, 2020, 8:32am

I forgot to mention: I’m running Icinga r2.12.0-1 on Ubuntu 16.04.

vars.backup_downtime = "04:00-04:45"

apply ScheduledDowntime "backup-downtime" to Service {
  author = "icingaadmin"
  comment = "Scheduled downtime for backup"

  ranges = {
    monday = service.vars.backup_downtime
    tuesday = service.vars.backup_downtime
    wednesday = service.vars.backup_downtime
    thursday = service.vars.backup_downtime
    friday = service.vars.backup_downtime
    saturday = service.vars.backup_downtime
    sunday = service.vars.backup_downtime
  }
  assign where service.vars.backup_downtime != ""
}

stevie-sy · August 28, 2020, 8:52am

Hi,
could you show us also the host definition of the two hosts and the notification rule?

brianmc · August 28, 2020, 10:15am

After a bit more searching, I suspect this might be the cause: Scheduled downtime in with HA masters

We have an HA setup, with accept_config = false on the primary master.

brianmc · August 28, 2020, 10:20am

Host and service config:

object Host "XX" {
  check_command = "dummy"
  vars.public = true

  vars.ooh_sme = "systems"
  vars.sidebar = "e-services"
}

object Service "XX YY" {
  import "generic-satellite-service"
  host_name = "XX"
  check_command = "check_login"
...
  vars.backup_downtime = "04:00-04:45"
}

Notification config:

apply Notification "sms-service-systems" to Service {
  import "sms-service-notification"

  users = [ "systems-sme-user" ]
  period = "systems-rota-timeperiod"

  types = [ Problem, Recovery ]
  states = [ OK, Warning, Critical, Unknown ]
  interval = 0 # disable re-notification

  assign where "systems-sme" in service.host.groups || "systems-sme" in service.groups
}

object User "systems-sme-user" {
  import "generic-user"
  enable_notifications = true

  display_name = "Systems SME"
  email = ZZ
  # pager number represents flag given to sms-gateway
  pager = "-s"
}

object TimePeriod "systems-rota-timeperiod" {
  import "legacy-timeperiod"

  display_name = "Systems SME rota notifications"
  ranges = {
    "monday"    = "00:00-09:00,17:30-24:00"
    "tuesday"   = "00:00-09:00,17:30-24:00"
    "wednesday" = "00:00-09:00,17:30-24:00"
    "thursday"  = "00:00-09:00,17:30-24:00"
    "friday"    = "00:00-09:00,17:30-24:00"
    "saturday"  = "00:00-24:00"
    "sunday"    = "00:00-24:00"
  }
}

stevie-sy · August 28, 2020, 10:31am

Hmm… my guess would be that an assign rule for the host or service group is wrong, or that a variable is missing on the second host. What I mean is that one host is in the group and the other is not. The result would be that this rule doesn’t work correctly:

assign where "systems-sme" in service.host.groups || "systems-sme" in service.groups
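
To illustrate the kind of mismatch meant here, a minimal sketch, assuming the "systems-sme" group is filled by a group assign rule based on a custom variable. The HostGroup definition and the use of vars.ooh_sme as the membership criterion are hypothetical; the thread does not show how the group is actually populated.

```
# Hypothetical group definition, not taken from the thread.
object HostGroup "systems-sme" {
  display_name = "Systems SME"

  # Only hosts that set this custom variable join the group. A host
  # that lacks it is left out without any error, so a rule that tests
  # service.host.groups would treat the two hosts differently.
  assign where host.vars.ooh_sme == "systems"
}
```

Running icinga2 object list --type Host --name XX on each host and comparing the resolved groups and vars should show whether the two really end up identical.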

brianmc · August 28, 2020, 11:12am

It’s a good idea, but the definitions are identical, which leads me to believe it’s a problem with the Icinga load-sharing at some level.

I installed and configured the system at v2.10, with accept_config = false on the primary master. The recommendation now seems to be that this should be true, so I’ll try that.
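
For anyone following along, a minimal sketch of that change, assuming it is made in the ApiListener object of the api feature on the primary master. The object name "api" and the omission of all other attributes (certificates, bind settings) are assumptions, since the thread does not show this part of the config.

```
# Sketch only: ApiListener on the primary master.
# Assumes the default object name "api"; other attributes are omitted.
object ApiListener "api" {
  accept_config = true  # was false in the setup described above
}
```

The config can be validated with icinga2 daemon -C before restarting the service. Whether this alone resolves the downtime behaviour is not confirmed in this thread.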

steaksauce · August 5, 2021, 6:09pm

Was this ever resolved? I am on 2.13.0-1 and notifications are sent for all hosts/services during scheduled downtime.

Downtime is applied via Icingaweb2 (not an apply rule) for maintenance windows by our Network Engineers. Downtime is fixed, not “flexible”, and downtime was sent.

Here is an example:

It does appear that the notification was sent out after the next check interval (20 minutes for this particular check), despite Downtime being scheduled for this service.

I can start a new topic if needed.