Icinga check_ssh tries 127.0.0.1:22 and I want to use network IP and alternative port

General description of the problem:

I have Icinga2 2.11.3-1 running on a FreeBSD 11.4 server. The configuration is done on the master server and distributed to the clients. We are monitoring the ssh service, but the check is being run against 127.0.0.1 on port 22. Our FreeBSD jails don’t use a localhost address, and when one is used, it doesn’t necessarily have to be 127.0.0.1. So we want to monitor the public IP on a non-standard port.

We are getting false positives that say:

connect to address 127.0.0.1 and port 22: Connection refused

When I manually run the check from Icinga Web, the alert disappears and I get:

SSH OK - OpenSSH_7.5 FreeBSD-20170903 (protocol 2.0)

If it were actually checking 127.0.0.1:22, it should be impossible to get an OK, because we don’t have the 127.0.0.1 IP and we are not using port 22. Some BSD servers don’t have the issue. I found one particular case where the server uses a different domain: most servers are host1.foo.bar, host2.foo.bar…, but this particular server is otherdomain.com and not a subdomain of foo.bar.

Configuration

The host configuration is located in /usr/local/etc/icinga2/zones.d/master/hostname.com.conf. The basic template for a host is this:

object Endpoint "host.foo.bar" {
    host = "10.10.10.xx"
}

object Host "host.foo.bar"  {
    import "generic-host"

    address = "10.10.10.xx"
   
    vars.os = "FreeBSD"
    vars.ssh_port = 4545
}

object Zone "host.foo.bar" {
    endpoints = [ "host.foo.bar", ]
    parent = "master"
} 

The ssh service check is defined in: /usr/local/etc/icinga2/conf.d/services/ssh.conf and the original configuration was:

apply Service "ssh" {
  import "generic-service"

  check_command = "ssh"
  vars.port = host.vars.ssh_port

  assign where host.address
}

I added vars.ssh_address = host.address to explicitly tell the check to use the host’s IP.

apply Service "ssh" {
  import "generic-service"

  check_command = "ssh"
  vars.port = host.vars.ssh_port
  vars.ssh_address = host.address

  assign where host.address
}

I assumed this was possible because I see that variable defined in /usr/local/share/icinga2/include/command-plugins.conf:

...    
object CheckCommand "ssh" {
        import "ipv4-or-ipv6"

        command = [ PluginDir + "/check_ssh" ]

        arguments = {
                "-p" = {
                        value = "$ssh_port$"
                        description = "Port number (default: 22)"
                }
                "-t" = {
                        value = "$ssh_timeout$"
                        description = "Seconds before connection times out (default: 10)"
                }
                "host" = {
                        value = "$ssh_address$"
                        skip_key = true
                        order = 1
                }
                "-4" = {
                        set_if = "$ssh_ipv4$"
                        description = "Use IPv4 connection"
                }
                "-6" = {
                        set_if = "$ssh_ipv6$"
                        description = "Use IPv6 connection"
                }
        }

        vars.ssh_address = "$check_address$"
        vars.check_ipv4 = "$ssh_ipv4$"
        vars.check_ipv6 = "$ssh_ipv6$"
}
...

I also saw that variable in the documentation.

The error persists, even though the variable is defined, as I can see in Icinga Web under Custom Variables.

From icinga2 object list I get the definition for the ssh check command:

Object 'ssh' of type 'CheckCommand':
  % declared in '/usr/local/share/icinga2/include/command-plugins.conf', lines 1302:1-1302:25
  * __name = "ssh"
  * arguments
    % = modified in '/usr/local/share/icinga2/include/command-plugins.conf', lines 1307:2-1329:2
    * -4
      * description = "Use IPv4 connection"
      * set_if = "$ssh_ipv4$"
    * -6
      * description = "Use IPv6 connection"
      * set_if = "$ssh_ipv6$"
    * -p
      * description = "Port number (default: 22)"
      * value = "$ssh_port$"
    * -t
      * description = "Seconds before connection times out (default: 10)"
      * value = "$ssh_timeout$"
    * host
      * order = 1
      * skip_key = true
      * value = "$ssh_address$"
  * command = [ "/usr/local/libexec/nagios/check_ssh" ]
    % = modified in '/usr/local/share/icinga2/include/command-plugins.conf', lines 1305:2-1305:39
  * env = null
  * execute
    % = modified in 'methods-itl.conf', lines 19:3-19:23
    * arguments = [ "checkable", "cr", "resolvedMacros", "useResolvedMacros" ]
    * deprecated = false
    * name = "Internal#PluginCheck"
    * side_effect_free = false
    * type = "Function"
  * name = "ssh"
  * package = "_etc"
  * source_location
    * first_column = 1
    * first_line = 1302
    * last_column = 25
    * last_line = 1302
    * path = "/usr/local/share/icinga2/include/command-plugins.conf"
  * templates = [ "ssh", "plugin-check-command", "ipv4-or-ipv6" ]
    % = modified in '/usr/local/share/icinga2/include/command-plugins.conf', lines 1302:1-1302:25
    % = modified in 'methods-itl.conf', lines 18:2-18:94
    % = modified in '/usr/local/share/icinga2/include/command-plugins.conf', lines 3:1-3:36
  * timeout = 60
  * type = "CheckCommand"
  * vars
    * check_address
      % = modified in '/usr/local/share/icinga2/include/command-plugins.conf', lines 4:2-13:3
      * arguments = [ ]
      * deprecated = false
      * name = "<anonymous>"
      * side_effect_free = false
      * type = "Function"
    * check_ipv4 = "$ssh_ipv4$"
      % = modified in '/usr/local/share/icinga2/include/command-plugins.conf', lines 15:2-15:24
      % = modified in '/usr/local/share/icinga2/include/command-plugins.conf', lines 1332:2-1332:31
    * check_ipv6 = "$ssh_ipv6$"
      % = modified in '/usr/local/share/icinga2/include/command-plugins.conf', lines 16:2-16:24
      % = modified in '/usr/local/share/icinga2/include/command-plugins.conf', lines 1333:2-1333:31
    * ssh_address = "$check_address$"
      % = modified in '/usr/local/share/icinga2/include/command-plugins.conf', lines 1331:2-1331:37
  * zone = ""

Here is an example of a host definition, also from icinga2 object list:

Object 'host.foo.bar' of type 'Host':
  % declared in '/usr/local/etc/icinga2/zones.d/master/host.foo.bar.conf', lines 5:1-5:42
  * __name = "host.foo.bar"
  * action_url = ""
  * address = "10.10.10.xx"
    % = modified in '/usr/local/etc/icinga2/zones.d/master/host.foo.bar.conf', lines 8:5-8:29
  * address6 = ""
  * check_command = "hostalive"
    % = modified in '/usr/local/etc/icinga2/conf.d/templates/generic-host.conf', lines 6:3-6:29
  * check_interval = 60
    % = modified in '/usr/local/etc/icinga2/conf.d/templates/generic-host.conf', lines 3:3-3:21
  * check_period = ""
  * check_timeout = null
  * command_endpoint = ""
  * display_name = "host.foo.bar"
  * enable_active_checks = true
  * enable_event_handler = true
  * enable_flapping = false
  * enable_notifications = true
  * enable_passive_checks = true
  * enable_perfdata = true
  * event_command = ""
  * flapping_threshold = 0
  * flapping_threshold_high = 30
  * flapping_threshold_low = 25
  * groups = [ ]
  * icon_image = ""
  * icon_image_alt = ""
  * max_check_attempts = 2
    % = modified in '/usr/local/etc/icinga2/conf.d/templates/generic-host.conf', lines 2:3-2:24
  * name = "host.foo.bar"
  * notes = ""
  * notes_url = ""
  * package = "_etc"
  * retry_interval = 30
    % = modified in '/usr/local/etc/icinga2/conf.d/templates/generic-host.conf', lines 4:3-4:22
  * source_location
    * first_column = 1
    * first_line = 5
    * last_column = 42
    * last_line = 5
    * path = "/usr/local/etc/icinga2/zones.d/master/host.foo.bar.conf"
  * templates = [ "host.foo.bar", "generic-host" ]
    % = modified in '/usr/local/etc/icinga2/zones.d/master/host.foo.bar.conf', lines 5:1-5:42
    % = modified in '/usr/local/etc/icinga2/conf.d/templates/generic-host.conf', lines 1:0-1:27
  * type = "Host"
  * vars
    * disks
      * disk /
        % = modified in '/usr/local/etc/icinga2/conf.d/templates/generic-host.conf', lines 8:3-10:3
        * disk_partitions = "/"
    * os = "FreeBSD"
      % = modified in '/usr/local/etc/icinga2/zones.d/master/host.foo.bar.conf', lines 10:5-10:23
    * ssh_port = 4545
      % = modified in '/usr/local/etc/icinga2/zones.d/master/host.foo.bar.conf', lines 11:5-11:24
  * volatile = false
  * zone = "master"

I am sure I am missing something very simple. I am still new to Icinga and trying to understand how things work.

The question is: how do I explicitly define the port and the IP for ssh checks?

Thanks a lot,

Cholan.

In your apply rule, vars.port = host.vars.ssh_port should be

vars.ssh_port = host.vars.ssh_port

but it is not needed anyway, as it is already defined on your host object.

Defining vars.ssh_address = host.address will be overwritten by

vars.ssh_address = "$check_address$"

and is also not needed.
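For illustration only, the corrected line would sit in the apply rule as sketched below. This is a minimal sketch based on the ssh.conf posted in the question; the only change is the variable name, so that it matches the $ssh_port$ macro read by the "-p" argument of the ssh CheckCommand.

```
apply Service "ssh" {
  import "generic-service"

  check_command = "ssh"

  // The CheckCommand reads $ssh_port$, so the custom variable must be
  // named ssh_port. A variable named plain "port" is not referenced by
  // any argument of the ssh CheckCommand shown in the question.
  vars.ssh_port = host.vars.ssh_port

  assign where host.address
}
```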

You can check the executed command by using Inspect in Icinga Web.

BTW: Since v2.11, every Zone and Endpoint object needs to be defined in zones.conf only.
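As a rough sketch of that last point, the Endpoint and Zone objects from the host file in the question would move into zones.conf, leaving only the Host object under zones.d. The file path below is an assumption based on the FreeBSD prefix used elsewhere in this thread, and the address is kept as it was posted.

```
// Assumed location: /usr/local/etc/icinga2/zones.conf
object Endpoint "host.foo.bar" {
  host = "10.10.10.xx"
}

object Zone "host.foo.bar" {
  endpoints = [ "host.foo.bar" ]
  parent = "master"
}
```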


Makes a lot of sense, however it still tries to go to 127.0.0.1 and port 22. When I do the manual check, the alert clears for a couple of minutes and then comes back.

In theory this should be enough:

apply Service "ssh" {
  import "generic-service"

  check_command = "ssh"

  assign where host.address
}

And in a host definition:

object Endpoint "host.foo.bar" {
    host = "10.10.10.xx"
}

object Host "host.foo.bar"  {
    import "generic-host"

    address = "10.10.10.xx"
   
    vars.os = "FreeBSD"
    vars.ssh_port = 4545
}

object Zone "host.foo.bar" {
    endpoints = [ "host.foo.bar", ]
    parent = "master"
} 

I can’t find the Inspect option in the service details. I go to Problems → Service Problems, click on the ssh problem for a host, and then I see the details. Is that where it is supposed to be? I thought it would be under Problem handling. Is this the right place, or do I have a misconfiguration?

I suffered from that several months ago. However, I did find something that might be an issue. I am not sure it is related to this issue, but it might be a general problem, so I share it here just in case.

The file hosts.conf is under zones.d, and it seems not to be processed. I copied it to conf.d, since I see in the documentation that it should be there. However, that breaks things. This is the content of the hosts.conf file, which I believe is the standard configuration:

/*
 * Host definitions with object attributes
 * used for apply rules for Service, Notification,
 * Dependency and ScheduledDowntime objects.
 *
 * Tip: Use `icinga2 object list --type Host` to
 * list all host objects after running
 * configuration validation (`icinga2 daemon -C`).
 */

/*
 * This is an example host based on your
 * local host's FQDN. Specify the NodeName
 * constant in `constants.conf` or use your
 * own description, e.g. "db-host-1".
 */

object Host NodeName {
  /* Import the default host template defined in `templates.conf`. */
  import "generic-host"

  /* Specify the address attributes for checks e.g. `ssh` or `http`. */
  address = "127.0.0.1"
  address6 = "::1"

  /* Set custom variable `os` for hostgroup assignment in `groups.conf`. */
  vars.os = "FreeBSD"

  /* Define http vhost attributes for service apply rules in `services.conf`. */
  vars.http_vhosts["http"] = {
    http_uri = "/"
  }
  /* Uncomment if you've sucessfully installed Icinga Web 2. */
  //vars.http_vhosts["Icinga Web 2"] = {
  //  http_uri = "/icingaweb2"
  //}

  /* Define disks and attributes for service apply rules in `services.conf`. */
  vars.disks["disk"] = {
    /* No parameters. */
  }
  vars.disks["disk /"] = {
    disk_partitions = "/"
  }

  /* Define notification mail attributes for notification apply rules in `notifications.conf`. */
  vars.notification["mail"] = {
    /* The UserGroup `icingaadmins` is defined in `users.conf`. */
    groups = [ "icingaadmins" ]
  }
}

I get configuration errors related to the master server:

[2020-08-21 16:10:58 +0000] critical/config: Error: Validation failed for object 'master.foo.bar!ssh' of type 'Service'; Attribute 'command_endpoint': Checkable with command endpoint requires a zone. Please check the troubleshooting documentation.
Location: in /usr/local/etc/icinga2/conf.d/services/ssh.conf: 1:0-1:18
/usr/local/etc/icinga2/conf.d/services/ssh.conf(1): apply Service "ssh" {
                                                    ^^^^^^^^^^^^^^^^^^^
/usr/local/etc/icinga2/conf.d/services/ssh.conf(2):   import "generic-service"
/usr/local/etc/icinga2/conf.d/services/ssh.conf(3): 	

Thanks a lot.

And this is the reason why ssh tries 127.0.0.1 on port 22: it just does what is configured.

This one is different and I’d suspect you have defined your host object twice.

This does not make sense, as ssh should not be executed on the host itself; hence, command_endpoint should not be set (which your examples don’t even show, so I suspect you have additional definitions somewhere else).

Inspect can be found here:
[Screenshot: Inspect]

Only if the Icinga Director is installed though.

Does the comma really exist here, or is it a typo?

Where is the command_endpoint option set?
Can you show the generic-service template please?

Ah, ok. Thanks! …

I thought that too, but the file wasn’t processed (I think due to the upgrade to Icinga 2.11). I removed it just to be sure, and the problem persists.

It has to be something like this. I installed the Director module for Icinga Web and now I have the Inspect button. So sometimes the check_ssh plugin checks 127.0.0.1 and other times the public IP + port.

For instance:

'/usr/local/libexec/nagios/check_ssh' '127.0.0.1'

Then I do a manual check and I get this:

'/usr/local/libexec/nagios/check_ssh' '-p' '4242' '10.10.10.xx'

Then after a few minutes the alert comes back again. Is there any difference between how Icinga processes the checks automatically and manual checks from the web?

Thanks

It was there, and it was working. I just removed it.

The template definition says command_endpoint = host.name:

template Service "generic-service" {
  max_check_attempts = 5
  check_interval = 5m
  retry_interval = 30s
  command_endpoint = host.name
}

Thanks again for your time and patience.

Does this happen for the same host?
By manual check, do you mean you hit ‘Check now’ in the web interface?

Have you tried removing this from the template?
command_endpoint = host.name means that any check with that option is executed on the remote host, which needs the agent installed.
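One way this suggestion could look is sketched below, assuming the template values posted above. The second template, "agent-service", is a hypothetical name introduced here only for illustration; it is not part of the poster's configuration. The idea is that network checks such as ssh import the plain template and run from the master, while checks that really need to run on the agent import the second one.

```
template Service "generic-service" {
  max_check_attempts = 5
  check_interval = 5m
  retry_interval = 30s
  // command_endpoint is deliberately not set here, so services that
  // import this template are executed by the zone that owns the host.
}

// Hypothetical template for checks that must run on the agent itself
// (for example local disk or load checks).
template Service "agent-service" {
  import "generic-service"

  command_endpoint = host.name
}
```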


It happens with the hosts I have problems with.

yes

What I actually did was override this value in ssh.conf so that the command is run on master.foo.bar. It seems to have fixed the issue; however, I will give it 24 hours before celebrating.

command_endpoint = "master.foo.bar"
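Put together with the apply rule from earlier in the thread, the override would look roughly like the sketch below. It assumes that "master.foo.bar" is the name of the master's Endpoint object and that the monitored hosts live in the master zone (the object list output above shows zone = "master"); both are assumptions drawn from this thread, not verified settings.

```
apply Service "ssh" {
  import "generic-service"

  check_command = "ssh"

  // generic-service sets command_endpoint = host.name, which sends the
  // check to the monitored host's agent. Assigning it again after the
  // import overrides that value, so the master runs check_ssh against
  // host.address and the host's vars.ssh_port.
  command_endpoint = "master.foo.bar"

  assign where host.address
}
```

Because the assignment comes after the import line, it replaces the value inherited from the template for this service only; other services that import generic-service keep running on their agents.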

Thanks a lot again.


Everything ok, thanks a lot @log1c!