How do you monitor your switch-stacks via SNMP?

gkasKaR · February 17, 2022, 8:21am

Hello Icinga Community,

since this is my first post, please excuse any mistakes in form.

As the title says, I am interested in best practices for monitoring switch stacks.
The check_nwc_health plugin works fine in general, but not when it comes to monitoring a stack of switches, at least with Huawei CloudEngine switches.

As you can see, the load and memory checks are working properly.
The SNMP health check (--mode hardware-health) returns an UNKNOWN state.

Do you have any ideas how to get the hardware-health information?
I would really like to stick to one SNMP plugin and not mix several.

Thanks

ShowMeYourSkil · February 19, 2022, 10:34am

Good day, which SNMP version are you working with? I could imagine that there is a mistake in the SNMP configuration on the master or on the switch.

stevie-sy · February 21, 2022, 6:44am

Hi and welcome,

do you also use check_nwc_health for your other checks (SNMP Load, SNMP Memory)?
It would also help a lot if you showed us your service definition.

gkasKaR · February 21, 2022, 8:40am

Hello,

I use SNMP v3.
Here are the checks executed via shell.
The first two checks query the stack, the other two a single switch.

Thank you

stevie-sy · February 21, 2022, 9:24am

Interesting that the other modes work fine and only the hardware-health mode fails.

You could try the following:

For the hardware-health mode, set a different timeout with the --timeout parameter.
Run an snmpwalk against the specific OIDs. The OIDs should be included in the script. What happens there?
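
From a shell, the two suggestions above could look roughly like the sketch below. Everything specific in it is an assumption for illustration: the host name, the SNMPv3 user, the passwords, and the auth/priv protocols are placeholders, and the default OID (entPhysicalTable from the ENTITY-MIB) is only a plausible starting point. The OIDs the plugin actually queries for Huawei devices should be taken from the plugin source.

```shell
#!/bin/sh
# Placeholder values (assumptions): replace with your own host and credentials.
HOST="${HOST:-switch-stack.example.com}"
SNMP_USER="${SNMP_USER:-monitor}"
AUTH_PASS="${AUTH_PASS:-changeme-auth}"
PRIV_PASS="${PRIV_PASS:-changeme-priv}"
PLUGIN="${PLUGIN:-check_nwc_health}"
WALK="${WALK:-snmpwalk}"

# Run the hardware-health mode with a longer plugin timeout.
# $1 = timeout in seconds (defaults to 60 here).
run_health_check() {
  "$PLUGIN" --hostname "$HOST" --protocol 3 \
    --username "$SNMP_USER" \
    --authprotocol sha --authpassword "$AUTH_PASS" \
    --privprotocol aes --privpassword "$PRIV_PASS" \
    --mode hardware-health --timeout "${1:-60}"
}

# Walk a subtree by hand to see whether (and how fast) the stack answers.
# $1 = OID to walk; the default is entPhysicalTable (ENTITY-MIB),
# an assumed starting point, not necessarily what the plugin reads.
run_walk() {
  "$WALK" -v 3 -l authPriv \
    -u "$SNMP_USER" -a SHA -A "$AUTH_PASS" -x AES -X "$PRIV_PASS" \
    -t 10 "$HOST" "${1:-1.3.6.1.2.1.47.1.1.1}"
}
```

After sourcing the file, `run_health_check 60` runs the check with a 60 second timeout, and `run_walk 1.3.6.1.4.1.2011` walks the Huawei enterprise subtree. If the manual walk is slow or stops partway, that points to a timeout problem rather than a credentials problem.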

gkasKaR · February 21, 2022, 10:03am

Hello Stevie,

the timeout hint did the trick: I set the value to 60.
Evidently the check takes longer than the default timeout when it has to query five switches.

Thanks for your help