Windows Process Monitoring with icinga-powershell-plugins

Hi guys,

Over the last few weeks our Windows server admins have installed the Icinga agent on our Windows servers (very often instead of, or in parallel with, SCOM). Step by step we are getting more and more requirements to monitor things beyond the basic checks (like CPU, memory, disk). This means that SCOM can be completely replaced on some servers.

Before we start writing our own scripts we want to use what already exists, like the checks Icinga installs automatically or the new icinga-powershell-plugins. We don’t want to use NSClient again.

One of the new requirements is to monitor processes in addition to the Windows services. Monitoring Windows services works fine. The background is - as you can imagine - that some services start processes such as Java. The service can still be running while the process has been killed for some reason.
As noted in old threads, check_procs.exe only counts processes. We wanted to give Invoke-IcingaCheckProcessCount a chance and expected behavior similar to the check_procs command on Linux from the monitoring-plugins. But in my first tests the result was different:

I tested this with Outlook. The task name is “OUTLOOK.EXE”. If you run the check with the process name “outlook” (which is not running) or with the real name from the Task Manager, leaving the thresholds at the check's default values, the check result is always 0 (OK).

If I set the warning threshold to 0, the check still returns 0 (OK) with the process name “outlook”, and returns 1 (WARNING) with “OUTLOOK.EXE”.


As I understand it, the check should at least return WARNING (better CRITICAL) if the process to check is not running, like the check from the monitoring-plugins. If I check a process, I am interested in whether it is running or not, and in whether it is forking itself many, many times because of some problem.

My server admin colleagues can’t think of a use case for merely counting processes if the check doesn’t react when the given process name is not running. Or are we misunderstanding something? Or are we using this check the wrong way?

For us it would of course be fine if the check delivered with the powershell-plugins provided the same or similar result/behavior as its Linux counterpart.

Hi,

as far as I can tell this check only counts the processes with the given name. I looked at the source code; I didn’t understand everything, but I couldn’t find any hint that something is implemented to check whether a process is not running.

This is possible with the check.

Maybe you can file a feature request over here, since I think this is a feature others would also benefit from:

https://github.com/Icinga/icinga-powershell-plugins/issues

Best regards
Michael

Hi Michael,

thanks for the fast answer :slight_smile:

I’ll ask our PowerShell guru. PowerShell is not my thing and I don’t like it either :grin: I think he will understand the code better than I do and will have some ideas. :wink: After that we can file a feature request or, if he has a solution to extend the code so that it behaves similarly to check_procs from the monitoring-plugins, we could also open a merge request. We will see.


But as I wrote, it would be nice if the check gave the same or a similar result as the Linux one out of the box. Even our Windows server admins don’t see the point of a check that only counts the number of running processes without checking whether the process is actually running.

I’m not sure if this plugin follows the threshold specification, but in theory you should be able to set your warning/critical threshold to ‘less than 1’ - see https://nagios-plugins.org/doc/guidelines.html#THRESHOLDFORMAT.

e.g.

OK = 1
CRIT = :1

That was the behaviour we expected, since we know it from the monitoring-plugins, but no :wink:

If you want the check to be OK when 1 process is found and CRITICAL if less than one (no processes) is found, the threshold for critical should be "1:" (with the trailing colon, which means "alert if the value is below 1").

The plugins support all other threshold ranges as well:
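To illustrate why "1:" does the trick, here is a minimal Python sketch of the range semantics from the plugin guidelines linked earlier in the thread (10, 10:, ~:10, 10:20, @10:20). It is not the plugin's actual code; the function names `parse_range`, `alerts` and `check_process_count` are made up for this example, and treating an empty start (as in ":1") as 0 is an assumption of this sketch.

```python
import math


def parse_range(spec: str):
    """Parse a Nagios-style range string into (start, end, inside_alerts)."""
    inside = spec.startswith("@")      # "@" inverts: alert when INSIDE the range
    if inside:
        spec = spec[1:]
    if ":" in spec:
        start_s, end_s = spec.split(":", 1)
    else:
        start_s, end_s = "0", spec     # plain "10" means the range 0..10
    if start_s == "~":
        start = -math.inf              # "~" stands for negative infinity
    elif start_s == "":
        start = 0.0                    # assumption: empty start is treated as 0
    else:
        start = float(start_s)
    end = float(end_s) if end_s else math.inf  # "10:" means 10..infinity
    return start, end, inside


def alerts(value: float, spec: str) -> bool:
    """Return True if the value violates the threshold range."""
    start, end, inside = parse_range(spec)
    in_range = start <= value <= end
    return in_range if inside else not in_range


def check_process_count(count: int, warning=None, critical=None) -> int:
    """Map a process count to an exit code: 0 OK, 1 WARNING, 2 CRITICAL."""
    if critical is not None and alerts(count, critical):
        return 2
    if warning is not None and alerts(count, warning):
        return 1
    return 0


if __name__ == "__main__":
    print(check_process_count(0, critical="1:"))  # no process found
    print(check_process_count(1, critical="1:"))  # one process found
    print(check_process_count(0, warning="0"))    # count 0 with warning "0"
```

Under this reading, a plain warning threshold of 0 describes the range 0..0, so a count of 0 lies inside it and does not alert, whereas "1:" describes 1..infinity, so a count of 0 falls outside and alerts.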


Thanks, that is the missing link (y).

With this information it works as expected, and it is the behaviour we know from the monitoring-plugins under Linux.
