I work for an MSP and we use Icinga2 internally to monitor customer devices, including all workstations/notebooks.
About 100 notebooks are configured with one-way connectivity. The intention is that supported devices can connect back to a WAN IP from anywhere.
I have configured the host check to use the built-in cluster-zone command. The hosts will almost never be pingable, so a traditional ping host check wouldn’t work. All services depend on an “Agent Health” service, whose check command is cluster. I have enabled Active Checks, Passive Checks, Notifications, and the Event Handler. While devices are online, this works quite well.
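A minimal sketch of the setup described above, to make the question concrete. This is not our literal config: the host name, the zone name, the dependency name, and the assign rules are placeholders, and I am assuming the agent zone is named after the host.

```
// Host check: is the notebook's agent zone currently connected?
object Host "notebook-01" {                  // placeholder host name
  check_command = "cluster-zone"
  vars.cluster_zone = "notebook-01"          // assumption: zone named after the host

  enable_active_checks = true
  enable_passive_checks = true
  enable_notifications = true
  enable_event_handler = true
}

// The health service that every other service depends on
apply Service "Agent Health" {
  check_command = "cluster"
  assign where host.check_command == "cluster-zone"
}

// All other services on the host depend on "Agent Health"
apply Dependency "agent-health" to Service { // placeholder dependency name
  parent_service_name = "Agent Health"
  disable_notifications = true               // suppress child notifications while the parent is not OK

  assign where host.check_command == "cluster-zone"
  ignore where service.name == "Agent Health"
}
```

With a layout like this, notifications for the child services are only suppressed once “Agent Health” itself has been checked and found not OK, so the order and timing of those checks matter.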
The problems start when a device goes to sleep or comes back online. We immediately receive a problem notification for every service on the host, right before they all recover. Sometimes the problem notifications and the recoveries arrive at the same time. This is causing alarm fatigue among our techs.
Can anyone explain why this happens and how I can implement this without these notifications?
I’d really like to know whether the “not connected” messages come from the notebook agent or from the satellite/parent. I need the statistics, and the setup must still produce notifications for things like 100% full disks while users are working from home. I don’t care if we’re not getting results because the device is offline.
Thanks for any help!
