Windows Server 2019 sometimes crashes with the icinga2 agent

Hi,

I have several virtual Windows Server 2016/2019 machines monitored by the Icinga 2 Windows agent.

Occasionally one of these servers crashes with a blue screen. On any single server it happens about once a month, and more rarely several times a month.

I’m collecting memory dumps (MEMORY.DMP). The common bug check is IRQL_NOT_LESS_OR_EQUAL (0xA),
and PROCESS_NAME is always an Icinga process:

  • check_users.exe
  • check_load.exe
  • check_uptime.exe
  • check_procs.exe
  • check_service.exe

or a PowerShell script:

  • powershell.exe
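
To see which process shows up most often across the dumps, the PROCESS_NAME lines can be tallied from saved `!analyze -v` text output. This is only a rough sketch: it assumes each dump's analysis was saved as text and contains a line of the form `PROCESS_NAME:  <name>`; the function name `tally_process_names` and the sample strings are made up for illustration.

```python
import re
from collections import Counter

# Matches a line such as "PROCESS_NAME:  check_load.exe" in saved analysis text.
PROCESS_NAME_RE = re.compile(r"^PROCESS_NAME:\s*(\S+)", re.MULTILINE)


def tally_process_names(reports):
    """Count how often each process name appears, one hit per report at most."""
    counts = Counter()
    for text in reports:
        match = PROCESS_NAME_RE.search(text)
        if match:
            # Lower-case the name so "Check_Load.exe" and "check_load.exe" merge.
            counts[match.group(1).lower()] += 1
    return counts


# Hypothetical excerpts standing in for real analysis output.
sample_reports = [
    "BUGCHECK_CODE:  a\nPROCESS_NAME:  check_load.exe\n",
    "BUGCHECK_CODE:  a\nPROCESS_NAME:  powershell.exe\n",
    "BUGCHECK_CODE:  a\nPROCESS_NAME:  check_load.exe\n",
]

for name, count in tally_process_names(sample_reports).most_common():
    print(count, name)
```

If one plugin clearly dominates, that narrows down where to look; an even spread points more towards something common to all of them, such as a driver.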

The RAM on the hypervisor shows no errors.

My environment:

  • The Windows 2019 VMs run on various physical machines.
  • All Windows 2019 servers have the latest patches.
  • Icinga2 Agent 2.13.7
  • No NSClient++
  • Several custom plugins written in PowerShell.

I have already reduced the check frequency and added the icinga2 bin folder as an exclusion in the antivirus real-time scan.

Note: I have no problems with a Windows Server 2012 VM.

Is anyone else having similar problems monitoring Windows servers?
Any suggestions on where to investigate?

Thank you
Michele

Obtaining a memory dump would be very helpful: Collecting User-Mode Dumps - Win32 apps | Microsoft Learn
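
As a rough sketch of what that page describes: user-mode dumps are enabled per executable through the `LocalDumps` registry key and its `DumpFolder`, `DumpCount` and `DumpType` values. The helper below only prints the `reg add` commands to run from an elevated prompt; the function name `local_dumps_commands` and the folder `C:\CrashDumps` are assumptions for illustration, so adjust them to your setup.

```python
# Registry key documented by Microsoft for Windows Error Reporting local dumps.
LOCAL_DUMPS_KEY = r"HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps"


def local_dumps_commands(exe_name, dump_folder, dump_count=10, dump_type=2):
    """Build reg.exe commands enabling crash dumps for one executable.

    dump_type 2 requests a full dump, 1 a mini dump.
    dump_folder should not end with a backslash, which would escape the quote.
    """
    key = f"{LOCAL_DUMPS_KEY}\\{exe_name}"
    return [
        f'reg add "{key}" /v DumpFolder /t REG_EXPAND_SZ /d "{dump_folder}" /f',
        f'reg add "{key}" /v DumpCount /t REG_DWORD /d {dump_count} /f',
        f'reg add "{key}" /v DumpType /t REG_DWORD /d {dump_type} /f',
    ]


# Example folder (an assumption); plugin names taken from the original post.
for exe in ("check_load.exe", "check_users.exe"):
    for command in local_dumps_commands(exe, r"C:\CrashDumps"):
        print(command)
```

Once the values are set, a crash of the named process should leave a dump file in the chosen folder.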

These can then be analyzed with Visual Studio or WinDbg. Visual Studio has the less annoying interface, but I also had dumps where I could only get stack traces with WinDbg, so it might be worth trying both.

Please try updating the Icinga agent on the Windows systems.

The Icinga team has some fixes, such as: Retry file rename operations on Windows for some errors by julianbrost · Pull Request #8691 · Icinga/icinga2 · GitHub