We have an installation of IcingaWeb2 v2.6.1 on Icinga2 r2.9.1-1 running within AWS, with dozens of satellite nodes connecting back to the master node. When I set up this installation I used a DNS entry for the master node rather than hard-coding an IP (this may be relevant further into the description of the problem).
A week ago AWS had some sort of hiccup which resulted in two things happening: the master node was shut down and left offline for a protracted period of time, and AWS lost our Elastic IP and assigned a new ephemeral IP to the master. This effectively broke all of our security groups, the DNS entry for the master node, and a number of other things which were tied to the static IP that AWS had turned into a dynamic one.
I have assigned a new Elastic IP to the master node, I have updated the DNS A record for the master node so that it is pointing to the new EIP correctly, and I have updated all of our security groups. The monitoring is working properly and we are getting results from all of our satellite nodes. The web interface works fine and you can review data in it without issue. I have also rebooted the master since all of these recovery steps were taken.
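For anyone wanting to reproduce the DNS check, here is a minimal sketch of how I would confirm that the master's name resolves to the new EIP from a given host. The function name, hostname, and IP address are placeholders for illustration, not our real values.

```python
import socket

def resolves_to(hostname: str, expected_ip: str) -> bool:
    """Return True if any IPv4 address returned for hostname equals expected_ip."""
    try:
        infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    except socket.gaierror:
        # The name could not be resolved at all.
        return False
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return expected_ip in {info[4][0] for info in infos}

# Example call with placeholder values:
# resolves_to("icinga-master.example.com", "203.0.113.10")
```

Running this on both the master and the host serving the web interface would show whether either one still sees a stale address through its resolver or a local hosts entry.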
Whenever I try to perform an action via the WebUI, such as “Process Check Result”, “Send Notification”, “Schedule Downtime”, or “Check Now”, I get a timeout. All of the timeouts look about the same; see the attached screenshot for an example.
No significant configuration changes have been made to the system before or after the issue with AWS, so I do not currently suspect an issue along those lines. We are still able to use IcingaDirector to add and remove hosts and it appears that the API is working properly, so this might be isolated to just the IcingaWeb2 interface.
I suspect that the old IP address of the master node was somehow cached or recorded somewhere despite my using a DNS entry for it, but I have been unable to find that IP anywhere in the configuration, or any other evidence to support this theory. Any ideas on things I should check, or things I might do to resolve the issue, would be greatly appreciated.
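To make clear what kind of search I mean, here is a minimal sketch that walks a directory tree and reports every line containing a given string. The function name is hypothetical, and the directories in the example call (`/etc/icinga2` and `/etc/icingaweb2`) are an assumption about the usual package layout, so adjust them to your own install.

```python
import os

def find_string_in_tree(root: str, needle: str):
    """Yield (path, line_number, line) for each line under root that contains needle."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8", errors="replace") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        # Plain substring match: "10.0.0.1" also matches "10.0.0.12".
                        if needle in line:
                            yield path, lineno, line.rstrip("\n")
            except OSError:
                # Unreadable file (permissions, broken symlink, ...): skip it.
                continue

# Example call with a placeholder IP and assumed directories:
# for root in ("/etc/icinga2", "/etc/icingaweb2"):
#     for hit in find_string_in_tree(root, "198.51.100.7"):
#         print(hit)
```

A search like this only covers files on disk; it would not catch an address held in a database or in memory by a running process.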
P.S. I am aware that the running versions are not fully up to date, and we are planning to upgrade once we have resolved this issue.