Do you use event handlers, and if so, how?

In our Ask Me Anything series, @nilmerg answered a question about Icinga’s event handlers and said that, from his perspective, there isn’t really a known use for them.

@MarcusCaepio mentioned that their experience differs and suggested just asking you all for your experiences.

I would love to hear from all of you how you use event handlers, or whether you use them at all.

I’ll start (after not having thought about eventcommands for some time now ^^)

Currently there are no event commands/handlers configured on the systems I have set up or attend to.
I do not find them useless, but so far no customer has asked for an automatic action to be taken when something “breaks”.
After talking to colleagues, though, there could be a need for them in the future, and they could then be implemented.
If I understand correctly, the AMA video refers to the visual representation of configured event handlers/event commands in the web interface, correct?
I would put that under “nice to have”, so that e.g. ops could see the configured event options and the command.

Since before I joined my current employer, we have been using monit (a watchdog daemon) for “event handling” on some things. I’m not sure why they didn’t use event handlers in Icinga, but as we move to Alma8 it looks like we are shifting away from monit, so I will likely be using them more.

I’m using one event command to write messages to the Windows event log.

We’ve started a new project where we’ll make heavy use of event handlers to automate tasks, e.g. run cleanup scripts, restart (dependent) services, trigger application jobs, etc.

Back in the Icinga 1 days, we used event handlers to put a short downtime on services when a host recovered, to avoid additional notifications. For Icinga 2, I am still hoping that the following pull request gets merged: “Introduce a recovery_time attribute for checkables” by efuss (Pull Request #8323, Icinga/icinga2 on GitHub).

Other than that, I see some use cases like:

  • Restart of services
  • Automatic enlargement of hard disks for VMs

But these examples represent only a very small part of the failures.
In general, the failures are too diverse to be handled by event handlers.

I use one to restart Icinga 2 in order to kill zombie processes that were started via sudo and cannot be killed normally.

There seems to be consensus that the alternative solution is not such a great idea (primarily for security reasons):

I have not seen or thought of a third solution.

Thanks @theFeu for the thread.
At my previous employer, event handlers were used extensively.
Currently I am using event handlers to restart services. For example, in the past I had problems with the sfcbd-watchdog service on ESXi while using check_esxi_hardware. The service had to be restarted to get the check running again, which could easily be done with a handler.
More cases could be covered in the future, such as restarting Docker containers or other services.
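To make the restart case a bit more concrete, here is a minimal sketch of the decision logic such a handler script might contain. It is not my actual handler: the function names, the `soft_attempt` threshold and the use of `systemctl` are assumptions for illustration (on ESXi the restart would look different). The three inputs are meant to be filled from the Icinga 2 runtime macros `$service.state$`, `$service.state_type$` and `$service.check_attempt$`, passed as arguments by the EventCommand.

```python
import subprocess


def should_restart(state, state_type, attempt, soft_attempt=2):
    """Decide whether the handler should act on this event.

    Policy assumed for this sketch: only act on CRITICAL, once during
    the SOFT retries (at `soft_attempt`) and again when the state
    becomes HARD.
    """
    if state != "CRITICAL":
        return False
    if state_type == "HARD":
        return True
    return state_type == "SOFT" and attempt == soft_attempt


def handle_event(state, state_type, attempt, unit, run=subprocess.run):
    """Restart `unit` if the policy says so; return True if a restart was issued.

    `run` is injectable so the logic can be exercised without touching
    a real service. `systemctl restart` is an assumption; replace it
    with whatever restarts the service in your environment.
    """
    if not should_restart(state, state_type, attempt):
        return False
    run(["systemctl", "restart", unit], check=False)
    return True
```

Restarting once during the soft state gives the service a chance to recover before a notification goes out; whether that is desirable depends on the service.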

As notifications for event handlers do not exist yet, I have to send them from the handler myself (to an MS Teams channel). My wishes here would be:

  • See in Icingaweb2 whether an event handler is configured. I know this is hard to solve, as only the EventCommands, but not the event handlers themselves, are shown in the icinga2 object list.
  • See (in Icingaweb2), receive and configure notifications for a triggered event handler in the same way as I configure all other notifications.

These would be nice-to-have features, of course. I guess your focus is currently on IcingaDB etc. But imho event handlers are neither ancient nor rarely used, and this thread seems to be proving it so far :smiley: I could imagine they are not used that much because the configuration is quite tough compared to the rest of the Icinga configuration.
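For the Teams message sent from the handler itself, a rough sketch could look like the following. It assumes a Teams incoming webhook that accepts a simple JSON body with a `text` field; the function names, the message wording and the webhook URL are placeholders, not what I actually run.

```python
import json
import urllib.request


def build_teams_message(host, service, state, action):
    """Build the JSON body for a simple text message (wording is illustrative)."""
    text = f"Event handler for {host}!{service} ({state}): {action}"
    return json.dumps({"text": text}).encode("utf-8")


def post_to_teams(webhook_url, body, timeout=10):
    """POST the body to the webhook and return the HTTP status code.

    `webhook_url` is whatever URL your Teams channel connector gives
    you; nothing here is specific to Icinga.
    """
    request = urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

The handler would call `build_teams_message(...)` after it has acted and pass the result to `post_to_teams(...)`. Keeping the two apart makes the message format easy to check without network access.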

Hi guys,

what @MarcusCaepio wrote is what I thought, too. We use them a lot in our environment, and so far I cannot think of an alternative way to solve some of our “problems” without event handlers.

I commented beneath the video from Icinga / nilmerg when it was released, but somehow my comment was deleted.

I’ll try to recreate it from memory.

For our team and many of our customers, event handlers are very important.
Unfortunately, some customers use quite stupid and unstable software that cannot run without being restarted continuously whenever certain errors occur. In those cases we use some self-written event handlers.
For example, when a specific error occurs or a port goes down (all of which we monitor), an event handler kicks the server itself or the software and restarts it.
We also use them to acknowledge a host or a service automatically based on certain service states or outputs.
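As an illustration of the auto-acknowledge case, here is a minimal sketch, assuming the handler talks to the Icinga 2 REST API action `/v1/actions/acknowledge-problem`. The helper names and the output pattern are made up for the example; our real handlers look different.

```python
import re


def should_acknowledge(state, output, pattern):
    """Acknowledge only non-OK states whose plugin output matches `pattern`."""
    return state != "OK" and re.search(pattern, output) is not None


def build_ack_payload(host, service, author, comment):
    """Build the JSON payload for POST /v1/actions/acknowledge-problem.

    `filter_vars` keeps host and service names out of the filter
    expression itself, so odd characters in names cannot break it.
    """
    return {
        "type": "Service",
        "filter": "host.name == hn && service.name == sn",
        "filter_vars": {"hn": host, "sn": service},
        "author": author,
        "comment": comment,
        "notify": False,
    }
```

The handler would send this payload with an API user that has permission for the action; authentication and TLS handling are left out here on purpose.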

We migrated most of them from Icinga 1 to Icinga 2, so this may simply have grown historically.
But at least until now we do not know of an alternative way to solve these problems.
Maybe you could enlighten us in the dark :slight_smile:

I love events :slight_smile:

We use them to run some PostgreSQL tasks when needed. Specifically:

  • Run vacuum and vacuum analyze on DBs when a certain time has passed without any vacuum.
  • Reindex specific indexes when they become bloated.
  • Kill transactions idle for more than a week. (I’m allowed to do it :wink: )
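The last item can be sketched roughly as follows. This is not my production script, just a minimal example assuming a psycopg-style driver with `%s` placeholders; the function name and the choice of `state_change` as the "idle since" timestamp are assumptions.

```python
def build_terminate_idle_query(max_idle_days):
    """Return (sql, params) that terminates long-idle open transactions.

    Uses the standard pg_stat_activity view and pg_terminate_backend().
    The caller's own backend is excluded so the handler does not kill
    its own session.
    """
    if max_idle_days <= 0:
        raise ValueError("max_idle_days must be positive")
    sql = (
        "SELECT pid, pg_terminate_backend(pid) "
        "FROM pg_stat_activity "
        "WHERE state = 'idle in transaction' "
        "AND state_change < now() - %s * interval '1 day' "
        "AND pid <> pg_backend_pid()"
    )
    return sql, (max_idle_days,)
```

With a driver cursor this would be run as `cursor.execute(*build_terminate_idle_query(7))`. The role needs sufficient privileges to terminate other sessions.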

There are other situations where I plan to continue implementing events; I’m just lacking the time to work on them.

Cheers

But if I remember correctly, event handlers are still executed on every node in an Icinga cluster today. So in a cluster with several zones, where event handlers should only be executed on specific satellites, they are not usable without problems…

Afaik the event handler is executed on the node that ran the check command for the check. That could be a satellite, the agent host, or even the master.

You are right. I forgot that it was fixed. I had this in mind: “[dev.icinga.com #10208] Eventhandler trigger on all endpoints in high available zone” (Issue #3431, Icinga/icinga2 on GitHub).