thx for looking into this (and sorry for the long delay answer, it took us some time to get the right logs). It’s not reproducible everytime. It happens quite rarely actually.
# icinga2 --version
icinga2 - The Icinga 2 network monitoring daemon (version: r2.9.1-1)
Copyright (c) 2012-2018 Icinga Development Team (https://www.icinga.com/)
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Application information:
Installation root: /usr
Sysconf directory: /etc
Run directory: /run
Local state directory: /var
Package data directory: /usr/share/icinga2
State path: /var/lib/icinga2/icinga2.state
Modified attributes path: /var/lib/icinga2/modified-attributes.conf
Objects path: /var/cache/icinga2/icinga2.debug
Vars path: /var/cache/icinga2/icinga2.vars
PID path: /run/icinga2/icinga2.pid
System information:
Platform: Debian GNU/Linux
Platform version: 7 (wheezy)
Kernel: Linux
Kernel version: 4.14.79-xxxx-std-ipv6-64
Architecture: x86_64
Build information:
Compiler: GNU 4.7.2
Build host: 402f99151cf6
The logs:
[2019-08-08 12:18:09 +0000] debug/ServiceDbObject: service dependencies for 'server1.example.com!service1'
[2019-08-08 12:18:09 +0000] debug/ServiceDbObject: service contacts: server1.example.com!service1
[2019-08-08 12:18:09 +0000] debug/ServiceDbObject: service contactgroups: server1.example.com!service1
[2019-08-08 12:19:28 +0000] notice/NotificationComponent: Attempting to send reminder notification 'server1.example.com!service1!ldp'
[2019-08-08 12:19:28 +0000] notice/Notification: Attempting to send reminder notifications for notification object 'server1.example.com!service1!ldp'.
[2019-08-08 12:19:28 +0000] information/Notification: Sending reminder 'Problem' notification 'server1.example.com!service1!ldp' for user 'icingaadmin'
[2019-08-08 12:19:28 +0000] debug/DbEvents: add notification history for 'server1.example.com!service1'
[2019-08-08 12:19:28 +0000] debug/DbEvents: add contact notification history for service 'server1.example.com!service1' and user 'icingaadmin'.
[2019-08-08 12:19:28 +0000] information/ExternalCommandListener: Executing external command: [1565266768] PROCESS_SERVICE_CHECK_RESULT;server1.example.com;service1;0;
[2019-08-08 12:19:28 +0000] debug/DbEvents: add log entry history for 'server1.example.com!service1'
[2019-08-08 12:19:28 +0000] debug/DbEvents: add checkable check history for 'server1.example.com!service1'
[2019-08-08 12:19:28 +0000] debug/DbEvents: add state change history for 'server1.example.com!service1'
[2019-08-08 12:19:28 +0000] notice/Checkable: State Change: Checkable 'server1.example.com!service1' hard state change from UNKNOWN to OK detected.
[2019-08-08 12:19:28 +0000] information/Checkable: Checking for configured notifications for object 'server1.example.com!service1'
[2019-08-08 12:19:28 +0000] debug/Checkable: Checkable 'server1.example.com!service1' has 1 notification(s).
[2019-08-08 12:19:28 +0000] notice/Notification: Attempting to send notifications for notification object 'server1.example.com!service1!ldp'.
[2019-08-08 12:19:28 +0000] information/Notification: Sending 'Recovery' notification 'server1.example.com!service1!ldp' for user 'icingaadmin'
[2019-08-08 12:19:28 +0000] debug/DbEvents: add notification history for 'server1.example.com!service1'
[2019-08-08 12:19:28 +0000] debug/DbEvents: add contact notification history for service 'server1.example.com!service1' and user 'icingaadmin'.
[2019-08-08 12:19:28 +0000] notice/Process: Running command '/usr/bin/logger' '--tag' 'icinga-alerts' 'host:icinga1.example.com time:1565266768 type:PROBLEM state:OK server:server1.example.com service:service1': PID 7604
[2019-08-08 12:19:28 +0000] debug/DbEvents: add log entry history for 'server1.example.com!service1'
[2019-08-08 12:19:28 +0000] information/Notification: Completed sending 'Problem' notification 'server1.example.com!service1!ldp' for checkable 'server1.example.com!service1' and user 'icingaadmin'.
[2019-08-08 12:19:28 +0000] notice/Process: PID 7604 ('/usr/bin/logger' '--tag' 'icinga-alerts' 'host:icinga1.example.com time:1565266768 type:PROBLEM state:OK server:server1.example.com service:service1') terminated with exit code 0
[2019-08-08 12:19:29 +0000] notice/Process: Running command '/usr/bin/logger' '--tag' 'icinga-alerts' 'host:icinga1.example.com time:1565266769 type:RECOVERY state:OK server:server1.example.com service:service1': PID 9771
[2019-08-08 12:19:29 +0000] debug/DbEvents: add log entry history for 'server1.example.com!service1'
[2019-08-08 12:19:29 +0000] information/Notification: Completed sending 'Recovery' notification 'server1.example.com!service1!ldp' for checkable 'server1.example.com!service1' and user 'icingaadmin'.
[2019-08-08 12:19:29 +0000] notice/Process: PID 9771 ('/usr/bin/logger' '--tag' 'icinga-alerts' 'host:icinga1.example.com time:1565266769 type:RECOVERY state:OK server:server1.example.com service:service1') terminated with exit code 0
And here are the corresponding configuration sections:
Ok, thanks. That really looks odd, I have never seen this before. Since you’re using 2.9.1 I’d suggest upgrading … but beforehand, Wheezy needs a dist upgrade to at least Jessie, better Stretch and even better Buster
2.10 contains adjustments for notifications, it may be fixed there already.
That’s the plan. We already upgraded one server last week to stretch. The next should be upgraded next week. After that, we will upgrade icinga. I’ll keep this thread up-to-date to tell if it’s fixed or not.
One thing that bother me is from the log, in this order:
[2019-08-08 12:19:28 +0000] information/Notification: Sending reminder 'Problem' notification 'server1.example.com!service1!ldp' for user 'icingaadmin'
...
[2019-08-08 12:19:28 +0000] notice/Checkable: State Change: Checkable 'server1.example.com!service1' hard state change from UNKNOWN to OK detected.
...
[2019-08-08 12:19:28 +0000] notice/Process: Running command '/usr/bin/logger' '--tag' 'icinga-alerts' 'host:icinga1.example.com time:1565266768 type:PROBLEM state:OK server:server1.example.com service:service1': PID 7604
Do you think it can just be an “atomicity” issue? Like