Windows Agent not Connecting

I’ve been using Icinga2 with little to no issues for a long while now. We have the agent automatically deploying and everything has been working perfectly for quite some time.

Today, a Windows server that was previously working with the agent is all of a sudden reporting that it has no connection to Icinga2.

I tried my usual fixes that have always worked in the past, for example uninstalling the agent and letting it reinstall. This didn’t resolve the issue. I also disabled the firewall on both the Windows server and the Icinga2 server for troubleshooting purposes. This did not help either.

I see no errors or warnings in the deployments through Director, and I can’t think of anything that has changed recently except for cutting Icinga Web 2 over to https (this works fine).

All other Windows agents continue to connect and monitor services as expected.

What could be causing this? Where should I look?

In such cases I always start by looking into the logs, especially those of the affected agent and its parent(s). The errors I typically find are a hostname/certificate mismatch, a missing parent host IP, an unsigned certificate…

Thanks, I see nothing in the icinga log file of interest on the Windows host. What is odd to me is that all other Windows hosts are working fine and the agent deployment is controlled/cookie-cutter.

This could be a certificate issue but I’m not really sure where to look. I ran icinga2 ca list on the Icinga2 server and there are none. Also nothing is listed with --all.

I think I see the problem in the server’s icinga2 log. It is failing to connect to the host on the port with “connection refused”. I will keep digging.

Certificate issues are reported in the logs. If nothing is reported, then typically neither of the partners is trying to connect (usually due to missing “contact” config). Here you’ll find some troubleshooting tips.

Thanks. Right now I think the Windows agent is not listening on 5665… weird… but that is where I’m at now. Trying to understand why.

So it seems like my problem is that the Icinga2 agent on the Windows server will not listen on 5665.

I tried uninstalling it and reinstalling it. The service is running. But no matter what I try I never see it listening on 5665. Other Windows servers that are working show the 5665 port and a connection to the Icinga2 server as expected.
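
The “is anything listening on 5665” check can also be run from the Icinga2 server side with a plain TCP probe. This is only a minimal Python sketch, not something from the original setup; the function name `port_is_open` is made up for illustration, and it assumes Python 3 is available on the machine doing the probing.

```python
import socket


def port_is_open(host: str, port: int = 5665, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures
        return False
```

Calling `port_is_open("win-agent.example.com")` (a placeholder host name) should return `False` for an agent in the state described here and `True` for one of the working servers. It only shows whether the port accepts connections, not why the agent is not listening.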

Where should I look for why the icinga service wouldn’t be listening on 5665?

I ran the Icinga2 Windows Agent install via the Powershell script that we have installing it manually to watch what happens.

Everything looked clean except for this:

It seems like there is a problem with getting the conf file, and maybe that is why it isn’t listening. Any idea where to look from here? I’m wondering if this has to do with the API and our recent change to SSL/https instead of http.

As you use the Director: Do you deploy the agent with the Director-given host template token and the code snippet?

As there is an “Error 401 Unauthorized”:
Did you perhaps change the API user/password for the Director? Or change the agent key/token for the host template?
You also mention the change to https: is your code snippet perhaps still using http instead of https?

A tip for reinstalling the agent:

  • uninstall the agent
  • then delete C:\ProgramData\icinga2
    so the system is completely clean from the previous icinga2 install.

Are you using a self-signed certificate which is not stored in the local certificate store of that Windows machine? If so, you need to add -IgnoreSSLErrors (assuming you use icinga2-powershell-module).

BTW: You should take a look at the new Icinga for Windows Packages which will replace the old icinga2-powershell-module.

Hi,

Have you tried starting Icinga from the command line with icinga2.exe daemon to see what’s happening?

How is your connection configured? Is the master/satellite connecting to the agent?

In my environments, in most cases it’s the network team that kills my connections, or the virus scanner/app security has got some “hardening” :slight_smile: In rare cases I ****** up Icinga and do not know that I have done it :wink:

I am using Director and the powershell script. The agent installer is on a share and I configured it all in director. No passwords were changed.

I upgraded Director last night to the latest version just to make sure I was up to date.

Yes, I discovered the programdata thing early on when initially deploying the agent :slight_smile:

The interesting thing is that I never had this problem before, and there have been no changes to anything other than modifying icinga2 to use https instead of http and adding the VMware module for vSphere, which required a few modules to be updated to their latest versions.

If I curl the API URL with https from a Linux box, it detects a self-signed certificate (our internal CA) and complains/fails, but if I pass -k to ignore the detection it proceeds with no issue. I tried adding the root and intermediate certificates to the Linux box I was curling from, but it still complained and I had to ignore it to proceed.
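
The two curl behaviours above (verified vs. -k) can be reproduced in a small Python sketch, which returns the exact verification error instead of just failing. This is only an illustration under assumptions: the function names `build_tls_context` and `tls_handshake_error` are hypothetical, and the CA bundle path passed as `ca_file` would be wherever the internal root and intermediate certificates were saved in PEM format.

```python
import socket
import ssl
from typing import Optional


def build_tls_context(ca_file: Optional[str] = None, verify: bool = True) -> ssl.SSLContext:
    """Build a client TLS context.

    verify=True with ca_file additionally trusts that CA bundle.
    verify=False skips all certificate checks (similar to curl -k).
    """
    if not verify:
        ctx = ssl.create_default_context()
        ctx.check_hostname = False  # must be disabled before setting CERT_NONE
        ctx.verify_mode = ssl.CERT_NONE
        return ctx
    # With ca_file=None only the system trust store is used
    return ssl.create_default_context(cafile=ca_file)


def tls_handshake_error(host: str, port: int, ctx: ssl.SSLContext,
                        timeout: float = 5.0) -> Optional[str]:
    """Return None if the TLS handshake succeeds, otherwise the error text."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host):
                return None
    except OSError as exc:
        # ssl.SSLError and certificate verification errors are OSError subclasses
        return str(exc)
```

Running `tls_handshake_error` once with `build_tls_context()` and once with `build_tls_context(verify=False)` against the web server’s host and port should show the same split that curl shows. If the verified attempt still fails after pointing `ca_file` at the internal CA bundle, the returned text usually gives a hint about what is wrong, for example an incomplete chain or a host name mismatch.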

I’m going to look at this right now. Anything that would simplify the Windows agent process would be great. As of now I am using the PowerShell module with Director and deploying via SCCM. It has worked flawlessly since I initially configured it quite some time ago.

In this case it seems that the problem is communicating with the Icinga2 server and getting the config for the Windows agent over the API. I have ruled out firewalls, networking issues etc…

It seems like this is more of a set of scripts/modules to run powershell checks on an existing agent installation? It is a little confusing.

Yes, and it handles the installation of the Windows agents also.

OK. Sounds like an updated version of what I am currently using.

Whatever my issue is, it seems to be that the agent installation can no longer use the self-service API in Director.

Did you try with the parameter -IgnoreSSLErrors?