Monitoring Resource Utilization of Agents

davar1844 · May 9, 2019, 3:18pm

Hello Team,
I’m seeking your insights on alternatives or maybe existing solutions for monitoring the resource utilization of the group of icinga agents on a node/server.
The scenario is that I have multiple agents running on each node collecting performance data for CPU and memory utilization and other metrics. But I want to know the collective CPU and memory utilization of my icinga agents.
In a way, this can be extended to other processes too, for example knowing the collective CPU and memory utilization of the backup processes on each node.
Are there inherent features within Icinga for doing this?
There is an existing thread under “HowTo > Monitoring the monitoring”, but I couldn’t find what I was looking for there.

Much appreciation for your insights in advance.

dnsmichi · May 9, 2019, 3:26pm

I’m not quite sure I can follow. You have multiple icinga2 processes started on a single node, and all of them collect the same metrics? Apart from such a scenario not being officially supported, what’s the intention behind it? Sounds complicated and redundant to me.

Cheers,
Michael

davar1844 · May 9, 2019, 4:01pm

Hi Michael,
Sorry for the confusion. There is only one agent on the node, but there are separate plugins for collecting CPU and memory metrics. The intention is to efficiently identify the CPU and memory utilization of the plugins, so that we know how much CPU and memory is actually being used to run them. In a way, it is monitoring of the monitoring plugins.
This helps with troubleshooting and early identification of potential runaway processes associated with the plugins.
I hope this clarifies the scenario.
Thanks again.

dnsmichi · May 9, 2019, 4:04pm

Then the important question - Linux or Windows?

davar1844 · May 9, 2019, 5:39pm

Linux (RHEL 6/7) is the first priority, and Windows will follow.

log1c · May 10, 2019, 7:18am

Something like check_snmp_process.pl (@dnsmichi has an updated version on his GitHub) maybe?

Checks by SNMP v1 or v3 whether a process is running and how many instances are running (minimum & maximum).
It is also possible to check the memory and CPU used by one process or a group of processes.

dnsmichi · May 10, 2019, 7:26am

I’m not actively maintaining those manubulon plugins anymore, I just don’t have the time to do “everything”. That’s noted in their docs already.

In terms of plugin runtime and performance usage: Icinga forks a process for executing each check. You can monitor the runtime execution of such a process via ps, and try to peek into what’s happening in there.
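A minimal sketch of the ps approach, assuming a Linux ps that accepts repeated `-o field=` options and that the plugins can be recognised by a substring of their command line. The function names, the match pattern, and the output line are hypothetical and not part of Icinga:

```python
import subprocess
import sys


def parse_ps(output, pattern):
    """Sum %CPU and RSS (KiB) over ps lines whose command line contains pattern."""
    count, total_cpu, total_rss_kib = 0, 0.0, 0
    for line in output.splitlines():
        fields = line.split(None, 3)  # pid, pcpu, rss, full command line
        if len(fields) < 4 or pattern not in fields[3]:
            continue
        try:
            cpu, rss = float(fields[1]), int(fields[2])
        except ValueError:
            continue  # skip lines that do not parse as numbers
        count += 1
        total_cpu += cpu
        total_rss_kib += rss
    return count, total_cpu, total_rss_kib


def collect(pattern):
    """Take one ps snapshot and aggregate the processes matching pattern."""
    out = subprocess.check_output(
        ["ps", "-e", "-o", "pid=", "-o", "pcpu=", "-o", "rss=", "-o", "args="],
        universal_newlines=True,
    )
    return parse_ps(out, pattern)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python3 plugin_usage.py nagios/plugins/
    # Note: the pattern also matches this script's own command line.
    n, cpu, rss = collect(sys.argv[1])
    print("OK - %d matching processes | cpu=%.1f%% rss=%dKB" % (n, cpu, rss))
```

Two caveats: the `pcpu` value reported by ps is CPU time divided by the process lifetime, not an instantaneous reading, and a single snapshot will miss plugins that start and exit between samples.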

I doubt though that you can do application performance monitoring in that way, or do runtime profiling. That is something you really should wrap around the plugin call, e.g. measuring the execution time, calls to malloc(), opened file descriptors, etc.
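As a rough illustration of wrapping the plugin call, the sketch below runs a command as a child process and reports its wall-clock time, CPU time, and peak memory using Python's standard `resource` module (Unix only). The name `run_measured` and the stderr output format are assumptions made for this example; it does not count malloc() calls or file descriptors, which would need a profiler or tracer:

```python
import resource
import subprocess
import sys
import time


def run_measured(argv):
    """Run argv as a child process and return its exit code and resource usage."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.time()
    exit_code = subprocess.Popen(argv).wait()  # child inherits stdout/stderr
    wall = time.time() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "exit_code": exit_code,
        "wall_s": wall,
        "user_s": after.ru_utime - before.ru_utime,
        "sys_s": after.ru_stime - before.ru_stime,
        # Peak RSS of the largest waited-for child so far (KiB on Linux).
        "max_rss_kib": after.ru_maxrss,
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python3 measure.py /path/to/plugin --its --args
    stats = run_measured(sys.argv[1:])
    # Report to stderr so the plugin's own stdout stays untouched.
    sys.stderr.write(
        "wall=%(wall_s).3fs user=%(user_s).3fs sys=%(sys_s).3fs "
        "maxrss=%(max_rss_kib)dKiB\n" % stats
    )
    # Pass the plugin's exit code through; map signal deaths to UNKNOWN (3).
    sys.exit(stats["exit_code"] if stats["exit_code"] >= 0 else 3)
```

Because the wrapper forwards the plugin's output and exit code, it could in principle stand in for the plugin in a staging setup, at the cost of one extra interpreter start per check.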

Yet I am not sure what the overall benefit will be. Bad plugins or performance problems should be tested in a staging environment where you exclusively run them in sandboxes and test their performance.

Cheers,
Michael

log1c · May 10, 2019, 7:46am

You still did a good job updating them.
I just like using the most up-to-date version of anything I can find, unless the older version works better.