Notifications not working for two of my satellites

Hello community,

At work we have a setup that consists of:

  • 2 Master Nodes with Director (zone==master)
  • 5 Satellites (zone==fqdn)
  • lots of Clients (zone==fqdn)

I upgraded the software on all members to 2.11.3. The notification feature is disabled on all but the master nodes.

We are using a custom notification script that creates tickets in our ticket system via its REST API. For two of my satellites and their children the notifications are not working. The other satellites forward their notifications to one of the master nodes, and the script is called just fine.

I followed the “Analyze Notification Results” section of the documentation, but I cannot get the Icinga API to output anything for “last_notification_result”.

It just outputs:

{
    "error": 400.0,
    "status": "Invalid field specified: last_notification_result"
}

What can I do to display/further debug why no notifications are being relayed to my masters?

PS: The custom notification script is written in Python and, whenever it is called, logs to syslog facility 1 with log.info("Notification was triggered with the following arguments: %s" % (sys.argv)). For those two satellites these lines never appear, which means the notification script on the master is not even being called.
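For reference, the logging described above could look roughly like the sketch below. This is not the actual script: the logger name `create_otrs` and the helper names are placeholders, and it assumes that "facility 1" means the standard `user` facility (`SysLogHandler.LOG_USER`) and that the local syslog socket is `/dev/log`. The handler is passed in so the functions can be tried without a syslog daemon.

```python
import io
import logging
import logging.handlers


def syslog_handler():
    """Handler for the local syslog socket, facility 1 (user).

    Assumption: the socket lives at /dev/log, as on most Linux systems.
    """
    return logging.handlers.SysLogHandler(
        address="/dev/log",
        facility=logging.handlers.SysLogHandler.LOG_USER,
    )


def make_logger(handler, name="create_otrs"):
    """Return a logger that writes INFO and above to the given handler."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate lines if called twice
    logger.propagate = False
    logger.addHandler(handler)
    return logger


def log_invocation(logger, argv):
    """Log the argument list the script was started with."""
    logger.info("Notification was triggered with the following arguments: %s", argv)


def demo():
    """Write one line to an in-memory stream instead of syslog."""
    stream = io.StringIO()
    log = make_logger(logging.StreamHandler(stream))
    log_invocation(log, ["create_otrs", "--type", "2"])
    return stream.getvalue()
```

In the real script the call would presumably be `log_invocation(make_logger(syslog_handler()), sys.argv)`. If that line is missing from syslog, the script was never started at all, which is a different failure from the script starting and then erroring out.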

The debug.log only prints lines like:

grep -ir notification /var/log/icinga2
/var/log/icinga2/debug.log:[2020-03-30 11:37:07 +0200] notice/JsonRpcConnection: Received 'event::SetForceNextNotification' message from identity 'master-2'.
/var/log/icinga2/debug.log:[2020-03-30 11:37:07 +0200] notice/ApiListener: Relaying 'event::SetForceNextNotification' message
More info:

Notification Template:

template Notification "template-service-notification-otrs-customername" {
    command = "service-otrscreation"
    interval = 0s
    states = [ Critical, Warning ]
    types = [ Custom, Problem ]
    users = [ "icinga_customername" ]
    vars.otrs_customer_user = "icinga_customername"
}

Notification Apply Rule:

apply Notification "service-notifications-customername" to Service {
    command = "service-otrscreation"
    interval = 0s
    assign where "customername" in host.groups
    states = [ Critical, Warning ]
    types = [ Custom, Problem ]
    users = [ "icinga_customername" ]
    vars.otrs_customer_user = "icinga_customername"
}

Notification Command:

object NotificationCommand "service-otrscreation" {
    import "plugin-notification-command"
    command = [ "/usr/lib64/nagios/plugins/create_otrs" ]
    timeout = 2m
    arguments += {
        "--type" = {
            repeat_key = false
            required = true
            value = "2"
        }
        "-C" = {
            repeat_key = false
            required = false
            value = "$host.vars.otrs_customer_user$"
        }
        "-K" = {
            repeat_key = false
            required = true
            value = "$host.vars.otrs_id$"
        }
        "-T" = {
            repeat_key = false
            required = true
            value = "$service.display_name$ is $service.state$ on $host.display_name$"
        }
        "-b" = "$service.output$"
        "-c" = {
            repeat_key = false
            required = false
            value = "/etc/icinga2/otrs.ini"
        }
        "-q" = {
            repeat_key = false
            required = true
            value = "5"
        }
        "-s" = {
            repeat_key = false
            required = true
            value = "13"
        }
    }
}

Here’s what sudo icinga2 object list -t notification gives me (example):

Object 'NX-101.customername.local!Cisco Spanning Tree!service-notifications-customername' of type 'Notification':
  % declared in '/var/lib/icinga2/api/packages/director/4656a8d5-a0a3-4a83-880a-5e64a4e9f483/zones.d/master/notification_apply.conf', lines 104:1-104:57
  * __name = "NX-101.customername.local!Cisco Spanning Tree!service-notifications-customername"
  * command = "service-otrscreation"
    % = modified in '/var/lib/icinga2/api/packages/director/4656a8d5-a0a3-4a83-880a-5e64a4e9f483/zones.d/master/notification_templates.conf', lines 89:5-89:36
  * command_endpoint = ""
  * host_name = "NX-101.customername.local"
    % = modified in '/var/lib/icinga2/api/packages/director/4656a8d5-a0a3-4a83-880a-5e64a4e9f483/zones.d/master/notification_apply.conf', lines 104:1-104:57
  * interval = 0
    % = modified in '/var/lib/icinga2/api/packages/director/4656a8d5-a0a3-4a83-880a-5e64a4e9f483/zones.d/master/notification_templates.conf', lines 90:5-90:17
  * name = "service-notifications-customername"
  * package = "director"
    % = modified in '/var/lib/icinga2/api/packages/director/4656a8d5-a0a3-4a83-880a-5e64a4e9f483/zones.d/master/notification_apply.conf', lines 104:1-104:57
  * period = ""
  * service_name = "Cisco Spanning Tree"
    % = modified in '/var/lib/icinga2/api/packages/director/4656a8d5-a0a3-4a83-880a-5e64a4e9f483/zones.d/master/notification_apply.conf', lines 104:1-104:57
  * source_location
    * first_column = 1
    * first_line = 104
    * last_column = 57
    * last_line = 104
    * path = "/var/lib/icinga2/api/packages/director/4656a8d5-a0a3-4a83-880a-5e64a4e9f483/zones.d/master/notification_apply.conf"
  * states = [ "Critical", "Warning" ]
    % = modified in '/var/lib/icinga2/api/packages/director/4656a8d5-a0a3-4a83-880a-5e64a4e9f483/zones.d/master/notification_templates.conf', lines 91:5-91:34
    % = modified in '/var/lib/icinga2/api/packages/director/4656a8d5-a0a3-4a83-880a-5e64a4e9f483/zones.d/master/notification_apply.conf', lines 108:5-108:34
  * templates = [ "service-notifications-customername", "template-service-notification-otrs-customername" ]
    % = modified in '/var/lib/icinga2/api/packages/director/4656a8d5-a0a3-4a83-880a-5e64a4e9f483/zones.d/master/notification_apply.conf', lines 104:1-104:57
    % = modified in '/var/lib/icinga2/api/packages/director/4656a8d5-a0a3-4a83-880a-5e64a4e9f483/zones.d/master/notification_templates.conf', lines 88:1-88:62
  * times = null
  * type = "Notification"
  * types = [ "Custom", "Problem" ]
    % = modified in '/var/lib/icinga2/api/packages/director/4656a8d5-a0a3-4a83-880a-5e64a4e9f483/zones.d/master/notification_templates.conf', lines 92:5-92:31
    % = modified in '/var/lib/icinga2/api/packages/director/4656a8d5-a0a3-4a83-880a-5e64a4e9f483/zones.d/master/notification_apply.conf', lines 109:5-109:31
  * user_groups = null
  * users = [ "icinga_customername" ]
    % = modified in '/var/lib/icinga2/api/packages/director/4656a8d5-a0a3-4a83-880a-5e64a4e9f483/zones.d/master/notification_templates.conf', lines 93:5-93:28
    % = modified in '/var/lib/icinga2/api/packages/director/4656a8d5-a0a3-4a83-880a-5e64a4e9f483/zones.d/master/notification_apply.conf', lines 110:5-110:28
  * vars
    * otrs_customer_user = "icinga_customername"
      % = modified in '/var/lib/icinga2/api/packages/director/4656a8d5-a0a3-4a83-880a-5e64a4e9f483/zones.d/master/notification_templates.conf', lines 94:5-94:42
  * zone = "monitoring.customername.local"
    % = modified in '/var/lib/icinga2/api/packages/director/4656a8d5-a0a3-4a83-880a-5e64a4e9f483/zones.d/master/notification_apply.conf', lines 104:1-104:57

Hi and welcome,

Did you check the Icinga log file on both masters for error messages, and whether the notification script is really triggered on a notification event for the service? In an HA setup you never know which of the masters sends out the notification, so you have to check both. If this doesn't help, enable the debug log on both masters and trigger a notification event by setting a service/host to a hard critical state. The debug log shows a lot more than the normal log, but don't forget to turn it off as soon as you have finished your tests.

Regards,
Carsten


When I subscribe to the event stream to debug notifications with
curl -k -s -u root:icinga -H 'Accept: application/json' -X POST 'https://localhost:5665/v1/events?queue=debugnotifications&types=Notification'
I see the following on both masters:

{"author":"","check_result":{"active":true,"check_source":"monitoring.customername.local","command":["/usr/lib64/nagios/plugins/check_disk","-c","10%","-w","20%","-X","none","-X","tmpfs","-X","sysfs","-X","proc","-X","configfs","-X","devtmpfs","-X","devfs","-X","mtmfs","-X","tracefs","-X","cgroup","-X","fuse.gvfsd-fuse","-X","fuse.gvfs-fuse-daemon","-X","fdescfs","-X","overlay","-X","nsfs","-X","squashfs","-m"],"execution_end":1585813563.481116,"execution_start":1585813563.472071,"exit_status":1.0,"output":"DISK WARNING - free space: / 3970 MB (69.49% inode=98%); /boot 697 MB (78.33% inode=99%); /home 1865 MB (98.26% inode=100%); /var 3526 MB (92.66% inode=100%); /var/log 310 MB (16.33% inode=100%); /tmp 3773 MB (99.15% inode=100%); /var/cache 3629 MB (95.36% inode=100%); /var/log/audit 888 MB (93.23% inode=100%);","performance_data":["/=1743MB;4571;5142;0;5714","/boot=192MB;749;843;0;937","/home=32MB;1518;1708;0;1898","/var=279MB;3044;3425;0;3806","/var/log=1587MB;1518;1708;0;1898","/tmp=32MB;3044;3425;0;3806","/var/cache=176MB;3044;3425;0;3806","/var/log/audit=64MB;761;856;0;952"],"schedule_end":1585813563.481301,"schedule_start":1585813563.481301,"state":1.0,"ttl":0.0,"type":"CheckResult","vars_after":{"attempt":1.0,"reachable":true,"state":1.0,"state_type":1.0},"vars_before":{"attempt":4.0,"reachable":true,"state":1.0,"state_type":0.0}},"command":"service-otrscreation","host":"monitoring.customername.local","notification_type":"PROBLEM","service":"Linux Disks [IAS]","text":"","timestamp":1585813558.52075,"type":"Notification","users":["icinga_customername"]}
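Raw event lines like the one above are hard to read. A minimal sketch for condensing them, assuming the stream delivers one JSON object per line as in the output shown; the function names are made up for illustration, and only fields that appear in the event above are used:

```python
import json


def summarize_notification_event(line):
    """Turn one JSON line from the event stream into a short summary.

    Returns None for blank lines and for events that are not of
    type "Notification".
    """
    line = line.strip()
    if not line:
        return None
    event = json.loads(line)
    if event.get("type") != "Notification":
        return None
    check_result = event.get("check_result") or {}
    return {
        "host": event.get("host"),
        "service": event.get("service"),
        "notification_type": event.get("notification_type"),
        "command": event.get("command"),
        "users": event.get("users", []),
        "check_source": check_result.get("check_source"),
        "state": check_result.get("state"),
    }


def summarize_stream(lines):
    """Yield a summary for every Notification event in an iterable of lines."""
    for line in lines:
        summary = summarize_notification_event(line)
        if summary is not None:
            yield summary
```

Piping the curl output into a small script that iterates over `summarize_stream(sys.stdin)` would print one compact record per notification. For the event above this gives the host, the service "Linux Disks [IAS]", the type PROBLEM, and state 1.0. Note that this only shows that a notification was triggered, not whether the command behind it succeeded.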

Hi,

So the masters do trigger a notification. What do you see in the log files? Problems with the script (errors) will not show up in the event stream.

Regards,
Carsten


Thanks, I found in icinga2.log that one parameter of my notification script was missing.
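For anyone hitting the same thing: several arguments in the NotificationCommand posted above are marked required = true, and some of them depend on macros. The sketch below is a simplified model of that dependency, not Icinga's actual macro resolver. The helper name and the example values are hypothetical, and `host.vars.otrs_id` is used only as an illustration, since I did not state above which parameter was missing.

```python
import re

# Matches $macro.name$ placeholders inside an argument value.
MACRO = re.compile(r"\$([^$]+)\$")

# Required arguments transcribed from the NotificationCommand posted above.
REQUIRED_ARGUMENTS = {
    "--type": "2",
    "-K": "$host.vars.otrs_id$",
    "-T": "$service.display_name$ is $service.state$ on $host.display_name$",
    "-q": "5",
    "-s": "13",
}


def missing_required(arguments, macros):
    """Return {argument: [macro names]} for arguments that cannot be filled.

    A macro counts as unresolved when it is absent from `macros` or
    maps to an empty value.
    """
    problems = {}
    for key, value in arguments.items():
        unresolved = [name for name in MACRO.findall(value) if not macros.get(name)]
        if unresolved:
            problems[key] = unresolved
    return problems
```

Called with macro values for a host that has no `otrs_id` custom variable, `missing_required(REQUIRED_ARGUMENTS, macros)` would report `-K` as the argument that cannot be built. A check like this makes it easy to see which hosts lack a custom variable the command depends on.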


Still, I would like to know why the notification analysis feature introduced in 2.11 is not working for me.

Do all endpoints really have to run 2.11? We currently have over 200…