Testing notifications on active checks?

peter · July 20, 2019, 7:07pm

I’ve taken over mid-implementation of a system with several notification requirements on a single alert (e.g. post to Slack, VictorOps, and email), and I’m trying to test notifications. I thought I’d be able to simply submit a passive check result to trigger notifications, but either that does not work for active checks, or my notifications are not working. Is there a good method to manually trigger alert states so that I can test notifications during configuration?

Thanks!

winem · July 22, 2019, 7:07am

Are both enable_passive_checks and enable_active_checks set to true? And how do you pass the results to Icinga: via the command file or the API?

I think you should see something in the icinga2.log in both cases. It should say if the service or host is not known or the result is somehow malformed.

dnsmichi · July 22, 2019, 7:31am

I typically create a dummy host/service using the dummy check command. This serves two purposes:

Send in check results via API
Do some calculations inside a lambda function in vars.dummy_text, which helps with debugging the state and makes it visible in Icinga Web 2 or the REST API

The check_interval and retry_interval values are set fairly high (1h, for example) so that scheduled checks do not interfere while I test manually with submitted check results.

If max_check_attempts is not set, it defaults to 3, so I need to feed in a critical result 3 times via the REST API action `process-check-result` in order to trigger a HARD state change and notifications.
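As a rough illustration of feeding in those results, here is a minimal Python sketch using only the standard library. The endpoint URL, the object names, and the helper function names are assumptions made for the example, not something from this thread; adjust them to your own setup and ApiUser.

```python
import base64
import json
import ssl
import urllib.request

# Assumed endpoint for illustration; replace host/port with your own API listener.
API_URL = "https://localhost:5665/v1/actions/process-check-result"


def build_check_result(host, service, exit_status, output):
    """Build the JSON body for the process-check-result action."""
    return {
        "type": "Service",
        "filter": 'host.name=="%s" && service.name=="%s"' % (host, service),
        "exit_status": exit_status,  # 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
        "plugin_output": output,
    }


def critical_results(host, service, attempts=3):
    """Return one CRITICAL payload per attempt needed to reach a HARD state."""
    return [
        build_check_result(host, service, 2, "test CRITICAL %d/%d" % (i, attempts))
        for i in range(1, attempts + 1)
    ]


def send_check_result(payload, user, password, verify_tls=True):
    """POST one check result to the API and return the parsed JSON response."""
    data = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(API_URL, data=data, method="POST")
    request.add_header("Accept", "application/json")
    request.add_header("Content-Type", "application/json")
    token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8"))
    request.add_header("Authorization", "Basic " + token.decode("ascii"))
    context = ssl.create_default_context()
    if not verify_tls:
        # Only for test setups with a self-signed certificate.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return json.loads(response.read().decode("utf-8"))
```

Usage would look something like `for p in critical_results("dummy-host", "dummy-service"): send_check_result(p, "apiuser", "secret")`, where the host, service, and credentials are placeholders. If max_check_attempts is set to something other than 3, pass that value as `attempts`.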

Inside the notification scope, there’s

a notification rule
a dummy user with a real email address
a notification command which triggers a script which “does something”. In order to see which values are available, the example mail notification script is sufficient, maybe modified to dump all parameters in ARGV or to add my own.
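A script that only dumps what it receives can stand in for the real notification script while testing. The sketch below is one hypothetical way to do that in Python; the log file name and function names are made up for the example, and it simply records the arguments and environment it was called with.

```python
#!/usr/bin/env python3
import os
import sys
import tempfile
import time

# Hypothetical log location; any path the Icinga user can write to will do.
LOG_PATH = os.path.join(tempfile.gettempdir(), "notification-debug.log")


def format_invocation(argv, environ):
    """Render the arguments and environment of one notification call as text."""
    lines = ["--- notification call at %s ---" % time.strftime("%Y-%m-%d %H:%M:%S")]
    for index, arg in enumerate(argv):
        lines.append("argv[%d] = %r" % (index, arg))
    for key in sorted(environ):
        lines.append("env %s = %r" % (key, environ[key]))
    return "\n".join(lines) + "\n"


def main(argv, environ, log_path=LOG_PATH):
    """Append the rendered call to the log file instead of sending anything."""
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(format_invocation(argv, environ))
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv, dict(os.environ)))
```

Pointing the notification command at a script like this shows exactly which parameters arrive, which makes missing or misnamed arguments easy to spot before switching to the real script.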

Once the notification is triggered, I follow the flow and trace it inside the debug log. Searching for the checkable’s name is the easiest way.
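The same search can be scripted if the log is large. This is only a sketch: the function name is invented, and the log path and the `host!service` naming in the usage note are assumptions about a typical setup with the debuglog feature enabled.

```python
def lines_mentioning(lines, checkable, keyword=None):
    """Return log lines that contain the checkable name (and keyword, if given)."""
    matches = []
    for line in lines:
        if checkable not in line:
            continue
        if keyword is not None and keyword.lower() not in line.lower():
            continue
        matches.append(line.rstrip("\n"))
    return matches
```

For example, `lines_mentioning(open("/var/log/icinga2/debug.log"), "dummy-host!dummy-service", keyword="notification")` would narrow the trace down to notification-related entries for one service.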

There are some things to check, as shown in the troubleshooting docs:

no hard state is reached - investigate there first
notification is triggered, but the states/types filters don’t let it through
notification is paused in a HA setup, switch to the secondary master and trace it there
notification gets triggered but parameters to the script are missing - verify the command and then the script for these parameters.
notification is sent for user A, but not user group B - check the applied notification rules with object list and the corresponding notification objects.

And so on. That’s also my workflow when testing new Icinga releases or a PR fixing something.

Cheers,
Michael

peter · July 24, 2019, 8:15pm

Interesting, thank you. I’ve been able to leverage a dummy alert and then toggle it by adding/removing an acknowledgement via Livestatus (we use Thruk as a front end).