Icinga-PowerShell-Plugins: Questions about Invoke-IcingaCheckEventlog

Hello all,

While implementing and testing the Icinga PowerShell Plugins (installed according to the instructions), we were asked to monitor the Windows EventLog as well, so we configured the check “Invoke-IcingaCheckEventlog”.
We ran into the following problems and would like to hear about your experience, because we may be thinking about this the wrong way.

  • After installing the plugins, the check does not work unless we set the switch “DisableTimeCache”. Without it, Icinga reports a permission error for the cache directory on every run. With the switch set, everything works fine. This may be a result of our server setup and the permissions set by our server admins.

  • While using the switch “DisableTimeCache” we also realized that it does not really make sense for our use case. Let’s assume a check interval of 5 minutes. During one interval a program writes a message to the event log and the check goes to warning/critical. At the next check the state is OK again, because no new event was written to the log during that interval.

  • Checking the whole event log makes no sense either, because we would have to know how many log entries are “normal”, or our colleagues would have to delete the event after fixing the problem. That would defeat the purpose of a log.

  • Using the switch “After” improves the situation a little. With the Icinga DSL (using var dt = DateTime() - 24 * 60 * 60; return dt.to_string() ) we can create a timestamp such as now - 24h. But our office is closed on weekends and public holidays, so if an event was written during that time, Icinga no longer shows it on the next working day. We would first have to check the logs on all servers again, or expand the time range.

  • The next idea would be for Icinga to stop checking the event log once the check turns critical. After fixing the problem we would set the check back to OK manually. The problem here is that another event written in the meantime would go unnoticed.

  • Another possibility would be for every server to ship its full event log to our ELK stack. However, we see problems similar to those described above when checking that log.
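
To make the weekend/holiday point from the “After” bullet concrete, here is a minimal sketch in Python of how the start of the lookback window could be pushed back to the last working day. It only illustrates the date arithmetic; it is not part of the plugins. The function name lookback_start and the holidays set are hypothetical, and the timestamp format that “After” expects has to be taken from the plugin documentation.

```python
from datetime import date, datetime, timedelta


def lookback_start(now, hours=24, holidays=frozenset()):
    """Return the start of the lookback window (hypothetical helper).

    Starts at `now - hours` and, if that lands on a Saturday, Sunday or a
    date listed in `holidays`, steps back whole days until it reaches a
    working day. Events written while the office was closed then stay
    inside the window on the next working day.
    """
    start = now - timedelta(hours=hours)
    # weekday(): Monday is 0, Saturday is 5, Sunday is 6
    while start.weekday() >= 5 or start.date() in holidays:
        start -= timedelta(days=1)
    return start


if __name__ == "__main__":
    # Monday morning: the window reaches back to Friday morning.
    monday = datetime(2024, 1, 8, 9, 0)
    print(lookback_start(monday).isoformat())  # 2024-01-05T09:00:00

    # Tuesday after a public holiday on Monday (assumed date for illustration).
    tuesday = datetime(2024, 1, 9, 9, 0)
    print(lookback_start(tuesday, holidays={date(2024, 1, 8)}).isoformat())
```

The trade-off is that the window grows after every weekend, so an event that was already handled on Friday shows up again on Monday until the window has moved past it.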

So what do you think? What is your experience?

Thank you

@cstein What do you think about this? I am asking because you helped in another thread. Do you have any ideas, solutions or workarounds here as well? That would be very nice. Thank you

We are using the definition below:

apply Service "XXX-P_MS_SQLEvtLogID" {
    import "XXX-tmplService-MS-PS_EventLog"

    assign where "XXX-tmplHost-MS" in host.templates
    vars.IcingaCheckEventlog_Array_IncludeEventId = [
    vars.IcingaCheckEventlog_Int32_Verbosity = "2"
    vars.IcingaCheckEventlog_Object_Warning = "~:0"
    vars.IcingaCheckEventlog_String_LogName = "Application"
    vars.IcingaCheckEventlog_Switchparameter_DisableTimeCache = false

    import DirectorOverrideTemplate

The only issue we see is that we may be missing some alerts when several events with the same event ID are written at the same time. We are not sure, though, as we can’t reproduce it.

Let’s say 1101 and 3201 are written in the same sample: in that case we have seen both event IDs mentioned in the alert.

What we don’t have is a test case where 1101 is written and alerted in one sample and 3201 is written in the next. We don’t know whether the new event would replace the description of 1101 with that of 3201.

Thanks for your example. I will add the parameters from your example that are missing to my service definition. Let’s see what happens.