Problems with command parsing from satellite to agent (no response)

nva · March 13, 2019, 8:48am

Hello all,

It’s been a while since I was able to work on our internal project, but I’m having problems with command parsing. I’ve defined several services, but it’s not working as it should. The error the director is giving me is “No data received”.

I’ve got a master (named mon001), a satellite (named sat001) and a terminal server (ts001, with the agent). The master is installed off-site and the satellite is installed in the internal network, together with the agent.

The communication between the master, satellite and agent is working fine.

The firewall ports are configured properly (I can connect from the satellite to the agent and I get results returned when running the check on the CLI), so the allowed_hosts and password settings in nsclient.ini are configured correctly.

The command from the satellite to the agent works (tested using CLI):

When I run: “./check_nscp_api -H ts001.fqdn.local -P 8443 --password XxXxXx -q check_drivesize -a drive=c:” from the CLI, it returns: “check_drivesize OK All 1 drive(s) are ok | ‘c: used’=29.379845GB;79.557028;89.501657;0;99.446285 ‘c: used %’=30%;80;90;0;100”

Whenever I run this through the director and tell it to use the satellite agent, I’m receiving “No data received”.

I’ve set up datafields:

I’ve configured the fields with the command

I’ve created the arguments for the command

I’ve defined a command

I’ve configured the parameters for the service

What is going wrong here?

nva · March 13, 2019, 8:51am

The message that I’m receiving:

P.S.: If I introduce an error on purpose, such as removing the IP address, it throws a TCP/IP error or an unknown host error at me. So there seems to be some form of communication.

aflatto · March 13, 2019, 9:01am

Hi

Can you show the service “inspect” output, specifically the command printout?

You should enable the debug log and capture the command execution of the service, to make sure that the command you think you built matches the one you are running on the CLI.
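For reference, the debug log is normally switched on with `icinga2 feature enable debuglog` followed by a restart of the service. The feature file this activates looks roughly like the sketch below. This assumes Icinga 2.10 or later; older releases build the path from a different constant, so check the file shipped with your own installation.

```icinga2
/* Sketch of the stock debuglog feature (assumed layout for Icinga 2.10+). */
object FileLogger "debug-file" {
  severity = "debug"              // log everything, including plugin execution details
  path = LogDir + "/debug.log"    // typically resolves to /var/log/icinga2/debug.log
}
```

The executed plugin command line is logged on the node that actually runs the check, so the debug log has to be enabled there, not only on the master.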

nva · March 13, 2019, 9:11am

Dear Assaf,

I’m not 100% sure what you mean. Is the output below what you mean?

object Service "check_nscp_free_disk_space_c" {
    host_name = "ts001.fqdn.local"
    check_command = "check_nscp_api"
    max_check_attempts = "5"
    check_period = "24x7"
    check_interval = 1s
    retry_interval = 1s
    enable_notifications = true
    enable_active_checks = true
    enable_event_handler = true
    enable_flapping = true
    enable_perfdata = true
    command_endpoint = host_name
    vars.argument = "'drive=C:'"
    vars.ipaddress = "'ts001.fqdn.local'"
    vars.password = "''"
    vars.query = "'check_drivesize'"
}

EDIT: The log below is the debug log on the master node (mon001). I assume this is what you need? As you can see, it does report on ts001.fqdn.local:
[2019-03-13 10:07:55 +0100] notice/JsonRpcConnection: Received ‘event::CheckResult’ message from ‘ts001.fqdn.local’
[2019-03-13 10:07:55 +0100] debug/Checkable: Update checkable ‘ts001.fqdn.local!check_nscp_free_disk_space_c’ with check interval ‘1’ from last check time at 2019-03-13 10:07:55 +0100 (1.55247e+09) to next check time at 2019-03-13 10$
[2019-03-13 10:07:55 +0100] debug/DbEvents: add checkable check history for ‘ts001.fqdn.local!check_nscp_free_disk_space_c’

log1c · March 13, 2019, 9:29am

Assaf meant the output behind the “Inspect link” on the service overview page:

Check whether all the arguments are filled in correctly by your fields.

Out of interest: why did you define a new command for the nscp_api instead of using the one imported from the ITL?

nva · March 13, 2019, 9:34am

Thank you for the clarification

I’m sorry for giving stupid answers sometimes, but I’m new to Icinga 2. In the past I used Nagios, but felt it was necessary to change sometime and since Ubuntu 14.04 is running out of support next month I thought this was the moment to do so.

The check result is as below:

Incorrect obviously, but how to correct it?

log1c · March 13, 2019, 9:52am

It looks like you have the icinga2 agent installed on the ts001 host, is that correct?

If yes, then you don’t need the nsclient anymore, as the icinga2 agent can run most/all checks out of the box.

As the checks are run locally on the client you can try the disk-windows command

If you really want to use the nsclient for the checks then you would have to change the check so that it is executed by the satellite and not the agent host itself.

One way could be setting the option “Run on agent” to no on the service template (not 100% sure if this is the correct way to do it).
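As a rough illustration of what that change means in the rendered configuration: with “Run on agent” set to no, the service should no longer carry `command_endpoint = host_name`, so the check is executed by the zone that owns the host object instead of by the agent. The sketch below uses the ITL `nscp_api` command and its custom variables. The zone placement (host object living in the satellite's zone) and the placeholder password are assumptions, not taken from the actual setup.

```icinga2
/* Sketch only: assumes the host object lives in the satellite's zone. */
object Service "check_nscp_free_disk_space_c" {
  host_name = "ts001.fqdn.local"
  check_command = "nscp_api"                  // ITL command wrapping check_nscp_api

  // No command_endpoint: the check runs on the satellite,
  // which then queries NSClient++ on the terminal server over the network.
  vars.nscp_api_host = "ts001.fqdn.local"
  vars.nscp_api_port = 8443
  vars.nscp_api_password = "XxXxXx"           // placeholder, as in the CLI test
  vars.nscp_api_query = "check_drivesize"
  vars.nscp_api_arguments = [ "drive=c:" ]
}
```

The values here are plain strings without inner single quotes. Icinga 2 hands arguments to the plugin without going through a shell, so extra quotes inside a value would most likely reach the plugin literally.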

nva · March 13, 2019, 9:59am

Thank you for the clarification, and yes, that’s correct. The agent is installed on TS001. The thing is that we want to monitor the disks individually with customized thresholds, as we’ve got more than one Terminal Server. Is this possible with check_disk.exe using arguments?

EDIT: Update: Changing the “Run on agent” explicitly to no resolved the issue.

Thanks for the assistance

nva · March 13, 2019, 10:16am

BTW: Thank all of you for the amazing support, I must say that the support and assistance in the icinga2 community is amazing

log1c · March 13, 2019, 10:23am

Yes, the check_disk.exe command also supports thresholds, just check the preview of the command in the director.
You can then either create various apply rules for your different disks/thresholds, or just use one apply rule for each disk and overwrite the thresholds for each host individually. That’s a function the Director offers.
Just go to the host object, open the services tab, choose the service, e.g. “disk-c” and enter the new threshold values in the correct field, and apply the overwrites.
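Outside the Director, the same idea (one apply rule per disk, with per-host threshold overrides) could be sketched in plain Icinga 2 configuration roughly as follows. The `disk-windows` command and its `disk_win_*` variables come from the ITL. The host custom variables `disk_c_warn` and `disk_c_crit`, the `host.vars.os` convention and the threshold values are hypothetical and only serve the illustration.

```icinga2
/* Sketch only: the host.vars names and threshold values are made up. */
apply Service "disk-c" {
  check_command = "disk-windows"      // ITL command wrapping check_disk.exe
  command_endpoint = host.name        // executed locally by the Icinga 2 agent

  vars.disk_win_path = "C:"
  vars.disk_win_warn = "20%"          // default thresholds for every host
  vars.disk_win_crit = "10%"

  // Per-host overrides, if the host defines them (hypothetical variables).
  if (host.vars.disk_c_warn) {
    vars.disk_win_warn = host.vars.disk_c_warn
  }
  if (host.vars.disk_c_crit) {
    vars.disk_win_crit = host.vars.disk_c_crit
  }

  assign where host.vars.os == "Windows"
}
```

The Director stores its per-host overrides differently from this sketch, so it illustrates the concept rather than what the Director renders.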

nva · March 13, 2019, 10:35am

Thank you for the update. I’m learning new things today. I’ve just checked the option, but this will not be possible in my case, as the satellite connects through a proxy construction (it’s used as forefront for the agents), so it can traverse NAT in various situations to the master server connected using a public IP address (firewalled of course). So the checks need to be performed by the satellite.