I’ve taken over mid-implementation of a system with several notification requirements on a single alert (e.g. post to Slack, VictorOps, email), and I’m trying to test notifications. I thought I’d be able to simply submit a passive check result to trigger notifications, but either that does not work for active checks, or my notifications are not working. I’m curious whether there is a good method to manually trigger alert states so that I can test notifications during configuration.
Are both enable_passive_checks and enable_active_checks set to true? And how do you pass the results to Icinga? Via the command file or the API?
I think you should see something in the icinga2.log in both cases. It should say if the service or host is not known or the result is somehow malformed.
I typically create a dummy host/service using the dummy check command. This serves two purposes:
- Send in check results via the API
- Do some calculations inside a lambda function in vars.dummy_text, which helps with debugging the state and makes the result visible in Icinga Web 2 or the REST API
The check_interval and retry_interval values are set fairly high (1h, for example) so that manually submitted check results stay in place long enough for testing.
If max_check_attempts is not set, it defaults to 3, so I need to feed in a CRITICAL result three times via the REST API action `process-check-result` in order to trigger a HARD state change and notifications.
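As a rough illustration of that step, here is a minimal Python sketch that posts a CRITICAL result to the `process-check-result` action several times. The host and service names, the API user, and the base URL are placeholders I made up, and the helper function names are my own; adjust them to your setup. It assumes the API listens on the usual port 5665 and that the API user is allowed to run this action.

```python
import base64
import json
import ssl
import urllib.request


def build_check_result(host, service, exit_status=2, output="Manual test result"):
    """Build the JSON body for the process-check-result action."""
    return {
        "type": "Service",
        # Filter expression selecting exactly one service on one host.
        "filter": f'host.name=="{host}" && service.name=="{service}"',
        "exit_status": exit_status,  # 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
        "plugin_output": output,
    }


def submit(base_url, user, password, payload, verify_tls=True):
    """POST one check result and return the decoded JSON response."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"{base_url}/v1/actions/process-check-result",
        data=json.dumps(payload).encode(),
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    context = ssl.create_default_context()
    if not verify_tls:
        # Only for test setups with self-signed certificates.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)


def trigger_hard_critical(base_url, user, password, host, service, attempts=3):
    """Send the same CRITICAL result `attempts` times (match max_check_attempts)."""
    payload = build_check_result(host, service)
    return [submit(base_url, user, password, payload) for _ in range(attempts)]
```

Something like `trigger_hard_critical("https://localhost:5665", "root", "secret", "dummy-host", "dummy-service")` would then send three results in a row; the credentials and object names here are hypothetical.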
Inside the notification scope, there are:
- a notification rule
- a dummy user with a real email address
- a notification command which triggers a script that “does something”. To see which values are available, the example mail notification script is sufficient, perhaps modified to dump all parameters in ARGV or to add my own.
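If you would rather not touch the mail script, a tiny stand-in script that only records what it was called with can serve the same purpose. This is just a sketch: the dump path and the environment variable prefixes are assumptions of mine, not something Icinga prescribes, so change them to whatever your notification command actually passes.

```python
import os
import sys
from datetime import datetime, timezone


def format_invocation(argv, env, prefixes=("NOTIFICATION", "HOST", "SERVICE", "USER")):
    """Return a text block listing all arguments and the matching env variables.

    `prefixes` is a hypothetical filter so the dump does not contain the whole
    environment; pass other prefixes if your command exports different names.
    """
    lines = [f"--- {datetime.now(timezone.utc).isoformat()} ---"]
    for index, arg in enumerate(argv):
        lines.append(f"argv[{index}] = {arg!r}")
    for key in sorted(env):
        if key.startswith(prefixes):
            lines.append(f"env {key} = {env[key]!r}")
    return "\n".join(lines) + "\n"


def main(argv, env, dump_path="/tmp/notification-dump.log"):
    """Append one invocation record to the dump file (path is an assumption)."""
    with open(dump_path, "a", encoding="utf-8") as handle:
        handle.write(format_invocation(argv, env))
    return 0


# When installed as the notification script, finish the file with:
# main(sys.argv, os.environ)
```

Each triggered notification then appends one record to the dump file, which makes missing or empty parameters easy to spot.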
Once the notification is triggered, I follow the flow and trace it inside the debug log. Search for the checkable’s name, that’s the easiest way.
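A plain grep does the job, but if you want line numbers and an extra keyword filter, a small Python helper along these lines works too. The function names are my own, and the log path in the usage note is only the location I would expect on a default package install; as far as I know, service checkables show up in the log as `host!service`.

```python
def trace_checkable(lines, checkable, keyword=None):
    """Yield (line_number, line) for every line mentioning the checkable.

    If `keyword` is given, the line must also contain it (case-insensitive).
    """
    wanted = keyword.lower() if keyword else None
    for number, line in enumerate(lines, start=1):
        if checkable not in line:
            continue
        if wanted and wanted not in line.lower():
            continue
        yield number, line.rstrip("\n")


def trace_file(path, checkable, keyword=None):
    """Read a log file and return all matching (line_number, line) pairs."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(trace_checkable(handle, checkable, keyword))
```

For example, `trace_file("/var/log/icinga2/debug.log", "dummy-host!dummy-service", keyword="notification")` would list only the notification-related lines for that (hypothetical) service.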
There are some things to check, as shown in the troubleshooting docs:
- no hard state was reached: investigate that first
- a notification exists, but the states/types filters don’t let it through
- the notification is paused in an HA setup: switch to the secondary master and trace it there
- the notification gets triggered, but parameters to the script are missing: verify the command and then the script for these parameters
- the notification is sent for user A, but not user group B: check the applied notification rules with `icinga2 object list` and the corresponding notification objects
And so on. That’s also my workflow when testing new Icinga releases or a PR fixing something.
Interesting, thank you. I’ve been able to leverage a dummy alert and then toggle it by adding/removing an acknowledgement via Livestatus (we use Thruk as a front end).