UI Actions/Commands Timeout

Spikelite · May 22, 2019, 5:46pm

Hello,

We have an installation of IcingaWeb2 v2.6.1 running on Icinga2 r2.9.1-1 within AWS, with dozens of satellite nodes connecting back to the master node. When I set up this installation, it used a DNS entry for the master node rather than a hard-coded IP (this may be relevant later in the description of the problem).

A week ago AWS had some sort of hiccup that resulted in two things happening: the master node was shut down and left offline for a protracted period of time, and they lost our Elastic IP and assigned a new ephemeral IP to the master. This effectively broke all of our security groups, the DNS entry for the master node, and a number of other things that were tied to the static IP that AWS had turned into a dynamic one.

I have assigned a new Elastic IP to the master node, I have updated the DNS A record for the master node so that it is pointing to the new EIP correctly, and I have updated all of our security groups. The monitoring is working properly and we are getting results from all of our satellite nodes. The web interface works fine and you can review data in it without issue. I have also rebooted the master since all of these recovery steps were taken.

The Problem:
Whenever I try to perform an action via the WebUI, such as “Process Check Result”, “Send Notification”, “Schedule Downtime”, or “Check Now”, I get a timeout. All of the timeouts look about the same; see the attached screenshot for an example.

No significant configuration changes have been made to the system before or after the issue with AWS, so I do not currently suspect an issue along those lines. We are still able to use IcingaDirector to add and remove hosts and it appears that the API is working properly, so this might be isolated to just the IcingaWeb2 interface.

I suspect that the old IP address of the master node was somehow cached or recorded somewhere despite using a DNS entry for it, but I have been unable to find that IP anywhere in the configuration or find any evidence to support this theory. Any ideas on things I should check, or things I might do to resolve the issue, would be greatly appreciated.

P.S. I am aware that the running versions are not fully up to date, and we are planning to upgrade once we have resolved this issue.

blakehartshorn · May 22, 2019, 5:57pm

Are you able to perform the same action connecting to the master by IP address using icingaweb’s api username and password?

https://icinga.com/docs/icinga2/latest/doc/12-icinga2-api/#configuration-management
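If it helps, here is a rough Python sketch of that test: it builds the same kind of request the “Check Now” button sends, so you can point it at the master's IP directly. The address, credentials, and host name in the usage note are placeholders, and the helper names `build_check_now_request` and `send_request` are made up for this example. It assumes the API listens on the default port 5665 and uses a self-signed certificate, which is why verification can be switched off for a quick test.

```python
import base64
import json
import ssl
import urllib.request


def build_check_now_request(master, api_user, api_password, host_name, port=5665):
    """Build (but do not send) a reschedule-check request for the Icinga 2 API."""
    url = f"https://{master}:{port}/v1/actions/reschedule-check"
    payload = {
        "type": "Host",
        "filter": f'host.name=="{host_name}"',
        "force": True,
    }
    token = base64.b64encode(f"{api_user}:{api_password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Accept": "application/json",  # the Icinga 2 API expects this header
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def send_request(request, timeout=10, verify_tls=False):
    """Send the request and return (HTTP status, response body)."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Assumption: self-signed API certificate; only skip checks while testing.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, timeout=timeout, context=context) as resp:
        return resp.status, resp.read().decode()
```

Calling `send_request(build_check_now_request("203.0.113.10", "icingaweb2", "secret", "example-host"))` with your real values should come back quickly with a JSON result. If it hangs until the timeout against one address but works against another, that tells you which address is stale.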

And is the icingaweb2 server remote or local? You mentioned you scrambled through the config, but just double checking that modules/monitoring/commandtransports.ini was one of the files you looked at?

Spikelite · May 22, 2019, 6:02pm

@blakehartshorn you rock, that was 100% the problem. The IP was hard-coded in modules/monitoring/commandtransports.ini, and after updating it there, things started working again! I had failed to go that deep into the configuration because I didn't think values were hard-coded in the modules directory, but that assumption was obviously wrong. Thank you so much for the help!

blakehartshorn · May 22, 2019, 6:04pm

Cool. If icingaweb2 is running on the same server as icinga2, just go ahead and set that to localhost.
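For anyone who lands here later, a small Python sketch for spotting this kind of leftover: it parses the contents of a commandtransports.ini and lists every transport whose `host` is a literal, non-loopback IP. The sample section below is illustrative only (the address, user, and password are made up), and the function name `find_hardcoded_ips` is just something I picked. On a package install the file usually lives under /etc/icingaweb2/modules/monitoring/, but treat that path as an assumption and check your own setup.

```python
import configparser
import ipaddress


def find_hardcoded_ips(ini_text):
    """Return (section, host) pairs whose host is a non-loopback IP literal."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    hits = []
    for section in parser.sections():
        # Icinga Web 2 writes values in double quotes; strip them before parsing.
        host = parser.get(section, "host", fallback="").strip().strip('"')
        try:
            address = ipaddress.ip_address(host)
        except ValueError:
            continue  # a hostname (or no host at all), not an IP literal
        if not address.is_loopback:
            hits.append((section, host))
    return hits


# Hypothetical file contents, for illustration only.
SAMPLE = """
[icinga2]
transport = "api"
host = "198.51.100.7"
port = "5665"
username = "icingaweb2"
password = "secret"
"""

print(find_hardcoded_ips(SAMPLE))  # [('icinga2', '198.51.100.7')]
```

To run it against the real file, read the file's text and pass it in place of `SAMPLE`. A `host` of localhost or 127.0.0.1 is not reported, which matches the setup suggested above for a single-server install.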

-Blake