Icinga2 Agent, Cluster, floating IP

I need to monitor a service (TSM) that is running with a floating IP on one of two hosts. Clients access the service through its floating service IP address.

When I configure the service check against the host address, everything works fine - but that does not cover a failover to the other host.
When I configure it against the floating service IP instead, Icinga says the Icinga instance (the agent) is not connected to the Icinga 2 server. Now, that's true.

Thing is, I also want to monitor the hosts themselves; they also have services like CPU, file systems, etc.

Any idea how to deal with this? Maybe register the same Icinga2 Agent with two names?

Thank you!

Hi.

As an idea:
Is it possible to add a separate/independent IP to each of the hosts and use that for monitoring purposes?
This would avoid the problems that come from using the floating IP (VIP).

The service itself - or the failover process - could be monitored separately.
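
As a rough sketch of what that could look like in the config - all host names and IPs here are made up for illustration:

```
// Each cluster node is monitored via its own, non-floating management IP,
// so agent connectivity does not depend on where the service currently runs.
object Host "tsm-node1" {
  import "generic-host"
  address = "10.0.0.11"            // dedicated per-node IP (assumed)
  vars.agent_endpoint = name       // agent checks (CPU, filesystems, ...) run here
}

object Host "tsm-node2" {
  import "generic-host"
  address = "10.0.0.12"
  vars.agent_endpoint = name
}

// The floating service IP gets its own host object and is only checked
// over the network - no agent is attached to it.
object Host "tsm-service" {
  check_command = "hostalive"
  address = "10.0.0.10"            // the VIP (assumed)
}
```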


Greetings.

Most likely the host will respond with its own IP and not with the floating one - this is at least the case when SNMP-monitoring a Windows cluster. The firewall on your Icinga server will then drop such packets. Try it with the firewall disabled.

Hello @danielbierstedt,
I have experienced the same problem when monitoring Windows clusters. I added the cluster nodes and can monitor them using the normal Icinga agent. I can NOT monitor the cluster's virtual node (which has a different FQDN than the nodes) using the Icinga agent.

The only solution I have found to monitor the cluster's virtual node was using the check_nrpe check command. I know this check command is not secure, but I don't know of another solution. I am also interested in a secure solution to this problem.
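
For reference, the NRPE workaround looks roughly like this - host name, address and the remote command are placeholders; the ITL ships an nrpe CheckCommand:

```
// Check the cluster's virtual node via NRPE instead of the Icinga agent.
// Note: classic NRPE traffic is the weakly secured part mentioned above.
object Service "cluster-drivesize" {
  host_name = "win-cluster-vip"
  check_command = "nrpe"
  vars.nrpe_address = "cluster-vip.example.com"   // virtual node's FQDN (assumed)
  vars.nrpe_command = "check_drivesize"           // defined on the NSClient++ side (assumed)
}
```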

Regards
Alex

Security is a big word here, so I cannot choose anything that is unencrypted, lacks authentication, etc.

I'll leave it as is for now, but I don't like it. Maybe I'll switch to check execution over SSH sometime, if necessary. That's not what I want, especially with firewalls to deal with, but let's see.

You could use the NSClient++ API to push NRPE/NSCP requests through; it supports authentication and encryption, but I'm not well-versed enough in security to tell you whether it's fully reliable.

You could set up the services as passive services - so Icinga will just sit and wait to receive results (with a freshness threshold that turns the service WARNING when no new status has arrived within the specified time).
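
A minimal sketch of such a passive service - the names and the 26h freshness threshold are assumptions:

```
// Purely passive service: Icinga waits for externally submitted results.
// If nothing arrives within check_interval, the "dummy" check fires and
// reports the configured fallback state.
object Service "daily-backup" {
  host_name = "app01"
  check_command = "dummy"
  enable_active_checks = false
  check_interval = 26h                      // freshness threshold (assumed)
  vars.dummy_state = 1                      // WARNING once the result is stale
  vars.dummy_text = "No backup result received within 26h"
}
```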

And then run the checks directly on the system (e.g. in Bash or PowerShell) and forward the check results to Icinga via the API.
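
Submitting the result from such a script then boils down to one call to the process-check-result action of the Icinga 2 API - credentials, host/service names and the output below are placeholders:

```
# Push a passive check result to the Icinga 2 API.
curl -k -s -u apiuser:apipassword \
  -H 'Accept: application/json' \
  -X POST 'https://icinga-master.example.com:5665/v1/actions/process-check-result' \
  -d '{ "type": "Service",
        "filter": "host.name==\"app01\" && service.name==\"daily-backup\"",
        "exit_status": 0,
        "plugin_output": "Backup finished successfully" }'
```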

I do something similar with checks for whether the daily backup was successful on my internet-facing servers.

What do you think?

Greetings from Austria
Wolfgang

I would monitor the floating IP from a host that actually wants to connect to that IP, or (as an alternative) execute the check on the Icinga master instead of on the host the application currently runs on. That implicitly also checks the firewalls in between, so I would make sure to check the application on all hosts in the cluster and, on top of that, the access over the network.
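
A sketch of that approach, assuming the floating IP is modelled as its own host object (names and the port are assumptions; 1500 is TSM's usual client port):

```
// No command_endpoint is set, so the check runs on the master itself and
// therefore also exercises the network path and firewalls in between.
object Service "tsm-port" {
  host_name = "tsm-service"        // host object carrying the floating IP
  check_command = "tcp"
  vars.tcp_port = 1500             // TSM client port (assumed)
}
```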

I also found that the usual tcp_port check of Icinga produces log entries about failed connects, incomplete logins, etc. Those then pop up in your log files (which you might also monitor). So, for most of our applications, we developed custom checks that not only probe the port but also verify that a session can be established and the monitoring user can log in. Or, if there is a REST API with a status URL behind that port, use the http and/or json check.
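
For the REST API case, a status-URL check with the ITL's http command could look like this - URL, address and the expected string are assumptions:

```
// Goes beyond a bare port probe: verifies that the application actually
// answers a status request with the expected content.
object Service "app-status" {
  host_name = "tsm-service"
  check_command = "http"
  vars.http_address = "10.0.0.10"             // floating IP (assumed)
  vars.http_uri = "/api/status"               // hypothetical status endpoint
  vars.http_ssl = true
  vars.http_string = "running"                // expected substring (assumed)
}
```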

Sounds like an idea; I'll think about it.