Reporting of what process is using memory?

Before I start re-inventing the wheel…

I’m trying to find a way to know what processes are using memory whenever a memory threshold is triggered. Ideally this information would be part of the memory check and not a separate check altogether. There are quite a few checks out there for both *nix and Windows for memory utilization, but I can’t seem to find any that tell you what is actually using the excessive amount of RAM. When you already know the process name or pid, you can craft checks to monitor the info on that particular process, but there’s always a chance of an unknown process using the memory.

Has anyone else tackled this issue? If so, how? How are you storing the historical memory usage per process? Are you able to reference it in any graphs or reports? Ideally I’d like a mouseover in Grafana that shows the top n processes when hovering over a memory usage graph.
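For what it's worth, here is a minimal sketch of the "part of the memory check" idea for Linux: the check computes used memory from `/proc/meminfo` and, only when a threshold trips, appends the top n processes by RSS as extra output lines. This is not an existing plugin. The function names, the default 80/90 thresholds, and the `MEMORY ...` wording are all made up for illustration, and it assumes a `ps` that accepts `-o pid= -o rss= -o comm=`.

```python
import subprocess


def parse_meminfo(text):
    """Return /proc/meminfo values (kB) keyed by field name."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def used_percent(meminfo):
    """Percent of RAM in use, derived from MemTotal and MemAvailable."""
    total = meminfo["MemTotal"]
    return 100.0 * (total - meminfo["MemAvailable"]) / total


def top_processes(ps_text, n=5):
    """Parse 'pid rss comm' lines and return the n largest as (pid, name, rss_kb)."""
    procs = []
    for line in ps_text.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            procs.append((int(parts[1]), int(parts[0]), parts[2].strip()))
    procs.sort(reverse=True)  # largest RSS first
    return [(pid, name, rss) for rss, pid, name in procs[:n]]


def build_result(pct, top, warn=80.0, crit=90.0):
    """Return (exit_code, output); thresholds are hypothetical defaults."""
    if pct >= crit:
        code, state = 2, "CRITICAL"
    elif pct >= warn:
        code, state = 1, "WARNING"
    else:
        code, state = 0, "OK"
    lines = [f"MEMORY {state} - {pct:.1f}% used | mem_used_pct={pct:.1f}%;{warn};{crit}"]
    if code != 0:
        # Only name the consumers when the threshold has actually tripped.
        lines += [f"{name} (pid {pid}): {rss} kB RSS" for pid, name, rss in top]
    return code, "\n".join(lines)


def main():
    """Collect live data on a Linux host and print the plugin output."""
    with open("/proc/meminfo") as fh:
        meminfo = parse_meminfo(fh.read())
    ps_text = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "rss=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    code, output = build_result(used_percent(meminfo), top_processes(ps_text))
    print(output)
    return code
```

To run it as a check you would end the file with `import sys; sys.exit(main())`. The process list lands in the check's long output, so it shows up with the alert but not in the perfdata that goes to the TSDB.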

Just thinking out loud here… the issue is that almost everything I have is saved in a TSDB, at least everything from the checks. Maybe some perfdata output is stored in the Icinga 2 DB… not sure. But the historical process info would need to go into something that can handle strings, so not Graphite/Whisper but something meant for event logging. I don’t know, just hoping somebody else has handled this. :)
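To make the "something that can handle strings" part concrete, here is a rough sketch that turns a top-n list into one JSON line per process, which a log or event store could ingest and which could later be matched to the memory graph by timestamp. The field names (`ts`, `host`, `rank`, `process`, `rss_kb`) and the sample data are hypothetical, and which store to ship the lines to is left open.

```python
import json
import socket
import time


def top_process_events(top, metric="rss_kb", host=None, timestamp=None):
    """Build one event dict per (pid, name, value) tuple; field names are hypothetical."""
    host = host or socket.gethostname()
    timestamp = time.time() if timestamp is None else timestamp
    return [
        {
            "ts": timestamp,
            "host": host,
            "rank": rank,
            "pid": pid,
            "process": name,
            metric: value,
        }
        for rank, (pid, name, value) in enumerate(top, start=1)
    ]


def to_json_lines(events):
    """Serialize events as newline-delimited JSON, one event per line."""
    return "\n".join(json.dumps(event, sort_keys=True) for event in events)


# Sample data only; in practice the list would come from the check itself.
sample = [(2, "java", 9000), (3, "nginx", 700)]
print(to_json_lines(top_process_events(sample, host="web01", timestamp=1700000000)))
```

Passing `metric="cpu_pct"` or similar would let the same shape cover the CPU and IO cases.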

…and ditto all of that for CPU and IO usage.

I have used a check that reports the top 5 CPU processes when there is a problem.

Have you tried check_procs?

 -m, --metric=TYPE
   Check thresholds against metric. Valid types:
   PROCS   - number of processes (default)
   VSZ     - virtual memory size
   RSS     - resident set memory size
   CPU     - percentage CPU
   ELAPSED - time elapsed in seconds

 check_procs -w 50000 -c 100000 --metric=VSZ
   Alert if VSZ of any processes over 50K or 100K

 check_procs -w 10 -c 20 --metric=CPU
   Alert if CPU of any processes over 10% or 20%