Can't monitor simple Windows Agent Disk Check from Master

Hello guys

I’ve been trying to set up a simple disk check on a remote Windows Agent for 3 days, without any luck.
I’m undoubtedly doing something wrong, but after reading the docs at least 5 times I can’t figure it out.

So our setup is simple now: we have one master called “nicklink-mon.nicklink.network” and one Windows Agent called “nicklink-ad1.nicklink.network”.

All configurations below are from the master!

zones.conf

object Zone "oostakker-hq" {
    endpoints = [ "nicklink-mon.nicklink.network", "nicklink-ad1.nicklink.network" ]
}
object Endpoint "nicklink-ad1.nicklink.network" {
    host = "10.88.2.5"
}
object Endpoint "nicklink-mon.nicklink.network" {
    host = "10.88.2.21"
}

hosts.conf in zones.d/oostakker-hq

object Host "nicklink-ad1.nicklink.network" {
    import "generic-host"
    check_command = "hostalive"
    address = "10.88.2.5"
    vars.agent_endpoint = "nicklink-mon.nicklink.network"
    vars.client_endpoint = name
    vars.os_type = "Windows"
    vars.os_type = "Server"
}

services.conf in zones.d/oostakker-hq

apply Service "C: Check" {
    check_command = "disk-windows"
    command_endpoint = "nicklink-ad1.comlink.network"
    vars.disk_win_path = "C:"
    vars.nscp_cpu_showall = true
    assign where host.vars.client_endpoint
}

Windows Agent
I set up the Windows Agent to accept connections on port 5665 and to accept commands and configs from the master.
I used the command line to set it up, and everything looks correct. I signed the certificates on the master.

IcingaWeb2
In IcingaWeb2 I get this error for the service:

Remote Icinga instance ‘nicklink-ad1.nicklink.network’ is not connected to ‘nicklink-mon.nicklink.network’

Honestly any hint in the right direction would be so helpful, my friend and I have been on this for 3 days and we really can’t figure it out.

The agent has to have its own zone and endpoint objects.
The agent zone then has to have the master zone “oostakker-hq” as a parent.

Right now your monitoring server and the windows agent are configured as a cluster pair inside the same zone, which will not work.

https://icinga.com/docs/icinga2/latest/doc/06-distributed-monitoring/#distributed-monitoring-top-down-command-endpoint should be the right section in the docs :slight_smile:
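
For reference, here is a minimal sketch of how the host and service could look once the agent has its own zone, following the command endpoint pattern from that docs section. It assumes the agent's Endpoint object has the same name as the host, that the agent zone has "oostakker-hq" as its parent, and that the "generic-host" template is available on the master. Treat it as an illustration, not a verified config.

```
// zones.d/oostakker-hq/hosts.conf (on the master)
object Host "nicklink-ad1.nicklink.network" {
    import "generic-host"            // assumed to exist on the master
    check_command = "hostalive"
    address = "10.88.2.5"
    // Name of the agent's Endpoint object, not the master's.
    vars.agent_endpoint = name
    vars.os_type = "Windows"
}

// zones.d/oostakker-hq/services.conf (on the master)
apply Service "C: Check" {
    check_command = "disk-windows"
    // Execute the check on the agent endpoint stored on the host.
    command_endpoint = host.vars.agent_endpoint
    vars.disk_win_path = "C:"
    // Only apply to hosts that actually have an agent endpoint set.
    assign where host.vars.agent_endpoint
}
```

Reading the endpoint name from the host variable avoids typing it twice, so a typo in the service's command_endpoint cannot point at an endpoint that does not exist.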

Alright, that was very helpful! I changed the configs, ran a config check and restarted, but the error persists. I’ve since added a Linux client with the Linux Agent and I get the same error…

Remote Icinga instance ‘icinga2-linux-test.nicklink.network’ is not connected to ‘nicklink-mon.nicklink.network’

I let the Wizard auto-create this in the zones.conf of icinga2-linux-test.nicklink.network, does this look correct?

object Endpoint "nicklink-mon.nicklink.network" {
    host = "10.88.2.21"
    port = "5665"
}
object Zone "oostakker-hq" {
    endpoints = [ "nicklink-mon.nicklink.network" ]
}
object Endpoint "icinga2-linux-test.nicklink.network" {
}
object Zone "icinga2-linux-test.nicklink.network" {
    endpoints = [ "icinga2-linux-test.nicklink.network" ]
    parent = "oostakker-hq"
}

The configs follow this example exactly, with the names and addresses adapted to my setup:
https://icinga.com/docs/icinga2/latest/doc/06-distributed-monitoring/#distributed-monitoring-master-agents
And I set up both agents with the node wizard, as shown here:
https://icinga.com/docs/icinga2/latest/doc/06-distributed-monitoring/#agentsatellite-setup-on-linux

Are there any logs I can share? I already checked firewalld and the ports should be open…

Where does the endpoint nicklink-ad1.comlink.network in your services.conf come from?
It is not defined anywhere.

Did you run the node wizard on the master as well?

Sorry that’s supposed to be nicklink-mon.nicklink.network too, it’s correct in the configs though.
I also re-ran the node wizard on my master, but denied the inclusion of conf.d directory this time.
The issue persists oddly enough… I can also reach all my Icinga instances with nmap on their port.

So the zones.conf on the master now should look like this or similar:

object Zone "oostakker-hq" {
    endpoints = [ "nicklink-mon.nicklink.network" ]
}
object Endpoint "nicklink-mon.nicklink.network" {
}

object Endpoint "nicklink-ad1.nicklink.network" {
    host = "10.88.2.5"
}
object Zone "nicklink-ad1.nicklink.network" {
    endpoints = [ "nicklink-ad1.nicklink.network" ]
    parent = "oostakker-hq"
}

object Endpoint "icinga2-linux-test.nicklink.network" {
}
object Zone "icinga2-linux-test.nicklink.network" {
    endpoints = [ "icinga2-linux-test.nicklink.network" ]
    parent = "oostakker-hq"
}

Did you restart icinga2 on the master after adding or changing the zones and endpoints in zones.conf?

My zones.conf looks very similar, so similar that I believe it’s correct.

object Endpoint "nicklink-mon.nicklink.network" {
}
object Zone "oostakker-hq" {
    endpoints = [ "nicklink-mon.nicklink.network" ]
}
object Endpoint "nicklink-ad1.nicklink.network" {
    host = "10.88.2.5" // The master actively tries to connect to the agent
    log_duration = 0 // Disable the replay log for command endpoint agents
}
object Endpoint "icinga2-linux-test.nicklink.network" {
    host = "10.88.2.172"
    log_duration = 0
}
object Zone "nicklink-ad1.nicklink.network" {
    endpoints = [ "nicklink-ad1.nicklink.network" ]
    parent = "oostakker-hq"
}
object Zone "icinga2-linux-test.nicklink.network" {
    endpoints = [ "icinga2-linux-test.nicklink.network" ]
    parent = "oostakker-hq"
}

I’ve restarted multiple times and I always run a config check beforehand, which shows no errors, only this warning:

warning/config: Ignoring directory ‘/var/lib/icinga2/api/zones/master’ for unknown zone ‘master’.

Do you think at this point I should try to reinstall my entire master node?

You have been very helpful so far, thank you very much

Looks like icinga expects a zone named “master”.
Did you rename your zone at some point?

Try rm -fR /var/lib/icinga2/api/* to delete “old stuff”.

Well, I’ve been debugging for more than 3 days, so that might have happened at some point. I ran your command but the issue persists.
However… When I click “Check Now” I get this error in IcingaWeb2:

icinga2: Can’t send external Icinga command: 401 Unauthorized. Please check your user credentials.

#0 /usr/share/icingaweb2/modules/monitoring/application/forms/Command/Object/CheckNowCommandForm.php(76): Icinga\Module\Monitoring\Command\Transport\CommandTransport->send(Object(Icinga\Module\Monitoring\Command\Object\ScheduleServiceCheckCommand))
#1 /usr/share/php/Icinga/Web/Form.php(1171): Icinga\Module\Monitoring\Forms\Command\Object\CheckNowCommandForm->onSuccess()
#2 /usr/share/icingaweb2/modules/monitoring/library/Monitoring/Web/Controller/MonitoredObjectController.php(325): Icinga\Web\Form->handleRequest()
#3 /usr/share/icingaweb2/modules/monitoring/library/Monitoring/Web/Controller/MonitoredObjectController.php(163): Icinga\Module\Monitoring\Web\Controller\MonitoredObjectController->setupQuickActionForms()
#4 /usr/share/icingaweb2/modules/monitoring/application/controllers/ServiceController.php(85): Icinga\Module\Monitoring\Web\Controller\MonitoredObjectController->handleCommandForm(Object(Icinga\Module\Monitoring\Forms\Command\Object\AcknowledgeProblemCommandForm))
#5 /usr/share/icingaweb2/library/vendor/Zend/Controller/Action.php(507): Icinga\Module\Monitoring\Controllers\ServiceController->acknowledgeProblemAction()
#6 /usr/share/php/Icinga/Web/Controller/Dispatcher.php(76): Zend_Controller_Action->dispatch(String)
#7 /usr/share/icingaweb2/library/vendor/Zend/Controller/Front.php(937): Icinga\Web\Controller\Dispatcher->dispatch(Object(Icinga\Web\Request), Object(Icinga\Web\Response))
#8 /usr/share/php/Icinga/Application/Web.php(300): Zend_Controller_Front->dispatch(Object(Icinga\Web\Request), Object(Icinga\Web\Response))
#9 /usr/share/php/Icinga/Application/webrouter.php(99): Icinga\Application\Web->dispatch()
#10 /usr/share/icingaweb2/public/index.php(4): require_once(String)
#11 {main}

Please check user credentials, I wonder what credentials?
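
On the credentials question: Icinga Web 2 sends "Check Now" through the Icinga 2 REST API, so the 401 most likely refers to the ApiUser that the monitoring module's command transport logs in with. Below is a sketch of such an object. The user name "icingaweb2", the password and the file location are placeholders and assumptions, not values from this setup. If the conf.d directory is no longer included after re-running the wizard, an ApiUser defined in conf.d/api-users.conf would not be loaded, which could explain the error.

```
// Assumed location: a config file that is still included on the master,
// e.g. conf.d/api-users.conf if conf.d is included, otherwise another
// included file or the master zone directory.
object ApiUser "icingaweb2" {
    // Placeholder; must match the password stored in Icinga Web 2.
    password = "CHANGE_ME"
    // Permissions commonly granted to the Icinga Web 2 API user.
    permissions = [ "status/query", "actions/*", "objects/modify/*", "objects/query/*" ]
}
```

The same user name and password have to be set in the API command transport of the Icinga Web 2 monitoring module (commonly /etc/icingaweb2/modules/monitoring/commandtransports.ini), and icinga2 needs a restart after the ApiUser is added.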

I’m going to reinstall my entire master now

Hello there,
did the reinstall fix your issue?

Oddly enough it did, in a roundabout way. I reinstalled, set up everything again from scratch (agents too) and got the same ‘not connected’ error.
I got frustrated and left it overnight.
The following morning in the office I checked and it discovered both services!
So I guess you could close this thread? Didn’t find a real solution though

That is rather strange indeed…

The only thing I can think of right now is that the hardware might not have been able to handle the load. That doesn’t make much sense either though…

Well, I guess I will just unhelpfully mark your ‘resolution post’ as the solution for now. If anyone else runs into this issue, or it returns for you, I hope they will reopen the thread by commenting :)

Good luck with your future endeavours with Icinga!
Feu