Hi, I have an issue monitoring the health and latency of the Internet service from a cluster of Linux gateways.
It is an active-backup cluster, so only one of the Linux servers holds the public IP at any time.
First I tried to monitor both gateways using the Icinga2 agent to run local Nagios plugin scripts (check_icmp), but I ran into a problem: to do the job I would have to point the agent at a private floating IP that is managed by Keepalived and, from what I have read, that is not possible in this scenario.
So I chose to use SNMP to query a custom OID and apply the service to the floating IP. That way, no matter which node has the public IP, the Icinga2 server will reach it.
That’s where the problem is. The output of check_snmp seems to be OK, but the Icinga2 server log shows an error about the perfdata and I can’t graph the results.
Config on Icinga SNMP Agent
/etc/snmp/snmpd.conf
extend int_sal /usr/lib/nagios/plugins/check_icmp -s x.x.x.x -c 200,15% -w 100,5% -H google.com
Running manually on the Icinga SNMP Agent
/usr/lib/nagios/plugins/check_icmp -s x.x.x.x -c 200,15% -w 100,5% -H google.com
OK - google.com: rta 3,224ms, lost 0%|rta=3,224ms;100,000;200,000;0; pl=0%;5;15;; rtmax=3,298ms;;;; rtmin=3,195ms;;;;
Icinga Server Config
Service
object Service "Internet Saliente" {
    host_name = "host.example.com"
    check_command = "snmpv3"
    max_check_attempts = "3"
    check_interval = 3m
    retry_interval = 1m
    enable_notifications = true
    enable_active_checks = true
    enable_passive_checks = true
    enable_event_handler = true
    enable_flapping = true
    enable_perfdata = true
    vars.snmp_v3 = true
    vars.snmpv3_auth_alg = "md5"
    vars.snmpv3_auth_key = "Password"
    vars.snmpv3_oid = ".1.3.6.1.4.1.8072.1.3.2.3.1.2.14.105.110.116.95.115.97.108.95.99.108.97.114.111.50"
    vars.snmpv3_seclevel = "authNoPriv"
    vars.snmpv3_user = "User"
}
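For reference, the long OID in vars.snmpv3_oid is the NET-SNMP-EXTEND-MIB::nsExtendOutputFull column followed by the extend name, encoded as its length and then one ASCII code per character. The Python below is only a sketch of that encoding; the helper names extend_oid and extend_name are made up for this post and are not part of Net-SNMP or Icinga.

```python
# Base OID of NET-SNMP-EXTEND-MIB::nsExtendOutputFull
EXTEND_OUTPUT_FULL = ".1.3.6.1.4.1.8072.1.3.2.3.1.2"


def extend_oid(name: str) -> str:
    """Build the nsExtendOutputFull OID for an 'extend <name> ...' entry.

    Hypothetical helper: the index is the name length followed by the
    ASCII code of each character.
    """
    encoded = [len(name)] + [ord(ch) for ch in name]
    return EXTEND_OUTPUT_FULL + "." + ".".join(str(n) for n in encoded)


def extend_name(oid: str) -> str:
    """Recover the extend name from an nsExtendOutputFull OID (hypothetical helper)."""
    prefix = EXTEND_OUTPUT_FULL + "."
    if not oid.startswith(prefix):
        raise ValueError("not an nsExtendOutputFull OID")
    parts = oid[len(prefix):].split(".")
    length, chars = int(parts[0]), parts[1:]
    return "".join(chr(int(c)) for c in chars[:length])


if __name__ == "__main__":
    print(extend_oid("int_sal"))
    print(extend_name(
        ".1.3.6.1.4.1.8072.1.3.2.3.1.2.14."
        "105.110.116.95.115.97.108.95.99.108.97.114.111.50"
    ))
```

Decoding the OID from the service definition this way gives int_sal_claro2, so the extend name on the agent is presumably longer than the int_sal shown in the snmpd.conf excerpt.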
Icinga Server debug log
tail -f /var/log/icinga2/debug.log
[2023-03-27 12:08:24 -0300] notice/Process: Running command '/usr/lib/nagios/plugins/check_snmp' '-A' 'Password' '-H' 'host_IP' '-L' 'authNoPriv' '-P' '3' '-U' 'User' '-a' 'md5' '-o' '.1.3.6.1.4.1.8072.1.3.2.3.1.2.14.105.110.116.95.115.97.108.95.99.108.97.114.111.50' '-t' '10' '-x' 'AES': PID 1682967
[2023-03-27 12:08:24 -0300] notice/Process: PID 1682967 ('/usr/lib/nagios/plugins/check_snmp' '-A' 'Password' '-H' 'host_IP' '-L' 'authNoPriv' '-P' '3' '-U' 'User' '-a' 'md5' '-o' '.1.3.6.1.4.1.8072.1.3.2.3.1.2.14.105.110.116.95.115.97.108.95.99.108.97.114.111.50' '-t' '10' '-x' 'AES') terminated with exit code 0
[2023-03-27 12:14:52 -0300] warning/GraphiteWriter: Ignoring invalid perfdata for checkable 'host.example.com!Internet Saliente' and command 'snmpv3' with value: rta=3,238ms;100,000;200,000;0;
[2023-03-27 12:14:52 -0300] debug/GraphiteWriter: Checkable 'host.example.com!Internet Saliente' adds to metric list: 'icinga2...host.example.com.services.Internet_Saliente.snmpv3.perfdata.pl.value 0 1679930092'.
[2023-03-27 12:14:52 -0300] debug/GraphiteWriter: Checkable 'host.example.com!Internet Saliente' adds to metric list: 'icinga2...host.example.com.services.Internet_Saliente.snmpv3.perfdata.pl.crit 15 1679930092'.
[2023-03-27 12:14:52 -0300] debug/GraphiteWriter: Checkable 'host.example.com!Internet Saliente' adds to metric list: 'icinga2...host.example.com.services.Internet_Saliente.snmpv3.perfdata.pl.warn 5 1679930092'.
[2023-03-27 12:14:52 -0300] warning/GraphiteWriter: Ignoring invalid perfdata for checkable 'host.example.com!Internet Saliente' and command 'snmpv3' with value: rtmax=3,272ms;;;;
[2023-03-27 12:14:52 -0300] warning/GraphiteWriter: Ignoring invalid perfdata for checkable 'host.example.com!Internet Saliente' and command 'snmpv3' with value: rtmin=3,213ms;;;;
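One thing that stands out in the log: pl is the only metric that gets written, and it is also the only one whose numbers have no decimal comma. The Python below is a rough sketch of a perfdata check, assuming the format from the Nagios plugin guidelines where decimals use a dot. It is not Icinga's actual parser, the name is_valid_perfdata is made up, and it only handles plain numeric thresholds (no ranges).

```python
import re

# Assumption: numbers are written with a dot as the decimal separator.
NUMBER = r"-?\d+(?:\.\d+)?"

# label=value[UOM];[warn];[crit];[min];[max]
PERFDATA = re.compile(
    rf"^(?P<label>[^=]+)=(?P<value>{NUMBER})(?P<uom>[A-Za-z%]*)"
    rf"(?:;(?P<warn>{NUMBER})?)?(?:;(?P<crit>{NUMBER})?)?"
    rf"(?:;(?P<min>{NUMBER})?)?(?:;(?P<max>{NUMBER})?)?;?$"
)


def is_valid_perfdata(token: str) -> bool:
    """Return True if a single perfdata token matches the sketch format above."""
    return PERFDATA.match(token) is not None


if __name__ == "__main__":
    # Tokens copied from the debug log above, plus one dot-decimal variant.
    for token in (
        "rta=3,238ms;100,000;200,000;0;",
        "pl=0%;5;15;;",
        "rtmax=3,272ms;;;;",
        "rtmin=3,213ms;;;;",
        "rta=3.238ms;100.000;200.000;0;",
    ):
        print(token, "->", "valid" if is_valid_perfdata(token) else "invalid")
```

Under that assumption the three comma-decimal tokens are rejected and pl=0%;5;15;; is accepted, which matches what GraphiteWriter reports.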
Running the command manually from Icinga Server
'/usr/lib/nagios/plugins/check_snmp' '-A' 'Password' '-H' 'host_IP' '-L' 'authNoPriv' '-P' '3' '-U' 'User' '-a' 'md5' '-o' '.1.3.6.1.4.1.8072.1.3.2.3.1.2.14.105.110.116.95.115.97.108.95.99.108.97.114.111.50' '-t' '10' '-x' 'AES'
SNMP OK - "OK - google.com: rta 3,279ms, lost 0%|rta=3,279ms;100,000;200,000;0; pl=0%;5;15;; rtmax=3,345ms;;;; rtmin=3,240ms;;;; " |
I could not figure out why Icinga2 cannot interpret the check_snmp output.
Thanks in advance!!!