Hello,
We’re currently running a hybrid monitoring environment using Icinga and SCOM, where Icinga monitors all Unix and network devices and Operations Manager watches over all Windows-based devices and applications.
However, because SCOM is a bit too bloated for our needs, we want to switch entirely to Icinga monitoring, using the Icinga 2 agent together with the PowerShell framework and plugins.
We’re not using Director. The Framework and Plugins are v1.4.0.
Following the documentation, we can now invoke checks through the agent and get decent output.
But one thing is keeping us from rolling this out on a big scale.
Take our network interface check as a reference:
& 'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe' -ExecutionPolicy 'ByPass' -C { Use-Icinga; $counterset = get-counter -ListSet 'network interface' | Select-Object -ExpandProperty pathswithinstances | where {$_ -inotmatch 'isatap'}; Invoke-IcingaCheckPerfcounter -PerfCounter $counterset -Warning $null -Critical $null -Verbose '0'}
This check is run every 60 seconds and gives us coherent results.
However, every time it runs, it also creates a new PowerShell instance, which by its nature takes a long time to start up. As a result, the check takes between 8 and 12 seconds to execute and uses about 20-25% CPU during that time. I found that the CPU load does not depend on the complexity of the check: a simple Invoke-IcingaCheckCPU may take only 1 second to run, but it still uses the full 25% CPU.
Because of this, checks sometimes overlap, so that, for example, our Exchange 2016 server sits at 80% CPU load most of the time, of which 50-75% is PowerShell.
When we disable the Icinga service, these processes disappear.
I now wonder:
Is there some kind of best practice I am missing or am I doing it entirely wrong?
When the command mentioned above is run multiple times in a single PowerShell session, the runtime drops from 8-12 seconds to around 2 seconds.
Can I configure the agent so that it keeps one PowerShell session open and executes its checks through that single session?
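For reference, here is a minimal sketch of how the timings can be compared between a fresh powershell.exe per run and repeated runs in one session. Measure-ColdRuns and Measure-WarmRuns are hypothetical helper names for illustration, not part of the Icinga framework. The sketch assumes the framework is installed so that Use-Icinga and Invoke-IcingaCheckCPU are available, and the numbers will differ from host to host.

```powershell
# Hypothetical helpers for illustration; not part of the Icinga PowerShell Framework.

# Time a command when every run starts a fresh powershell.exe process.
function Measure-ColdRuns {
    param(
        [Parameter(Mandatory)] [string] $Command,
        [int] $Count = 5
    )
    1..$Count | ForEach-Object {
        (Measure-Command {
            & powershell.exe -ExecutionPolicy Bypass -Command $Command | Out-Null
        }).TotalSeconds
    }
}

# Time a script block run repeatedly inside the current, already started session.
function Measure-WarmRuns {
    param(
        [Parameter(Mandatory)] [scriptblock] $Check,
        [int] $Count = 5
    )
    1..$Count | ForEach-Object {
        (Measure-Command { & $Check | Out-Null }).TotalSeconds
    }
}

# Only run the comparison if the Icinga framework is actually installed (assumption).
if (Get-Command Use-Icinga -ErrorAction SilentlyContinue) {
    $cold = Measure-ColdRuns -Command 'Use-Icinga; Invoke-IcingaCheckCPU'

    Use-Icinga   # load the framework once for the warm runs
    $warm = Measure-WarmRuns -Check { Invoke-IcingaCheckCPU }

    'cold avg: {0:N2} s' -f ($cold | Measure-Object -Average).Average
    'warm avg: {0:N2} s' -f ($warm | Measure-Object -Average).Average
}
```

The gap between the two averages is roughly the per-check cost of starting PowerShell and loading the framework, which is the overhead I would like to avoid.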
We really want to migrate fully into the Icinga universe, but tying up our resources with just five or so checks is not really viable.
Thank you for your help!