Architecture question: Agents at sites not reachable directly

Hello,

I’m facing a design where we have agents at sites that do have internet access, but cannot be reached directly (no port forwarding, etc.).

Does anybody know if it’s possible to let the agent connect to the master and deliver results etc. without the agent being accessible on port 5665?

Yes - you specify the address of the master on the agent (as its parent) but
you do not specify the address of the agent on the master.

Icinga will happily connect either way round; this forces the agent to connect
to the master and prevents the master from trying something that won’t work.

Once a connection is in place, started from either end, Icinga just works.
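
A minimal sketch of what that looks like (host and endpoint names are just placeholders):

On the agent, in zones.conf:

object Endpoint "icinga-master" {
  // host is set, so the agent initiates the connection
  host = "icinga-master"
}

On the master, the agent’s Endpoint:

object Endpoint "my-agent" {
  // no host attribute, so the master never tries to connect outwards
}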

Antony.

Thanks man, that helps a lot.

I noticed that I need to understand the differences between active and passive checks better. If I block incoming traffic on port 5665, checks are no longer scheduled as planned. So maybe I need to understand how to let the agent do its own scheduling.

I think I don’t understand this situation enough. Here is my test setup:

master: icinga-master
agent: icinga-setup

I simulate the situation by denying incoming traffic on port 5665 at the agent (rules shown below, followed by a sketch of how to create them):

root@icinga-setup:~# ufw status numbered
Status: active

     To                         Action      From
     --                         ------      ----
[ 1] 5665                       DENY IN     Anywhere
[ 2] 5665 (v6)                  DENY IN     Anywhere (v6)
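
(For reference, a deny rule like the ones above can be created with something like the following; the exact command may differ.)

# block incoming connections to the Icinga API port
ufw deny 5665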

Traffic Agent → Master

root@icinga-setup:~# nc -vz icinga-master 5665
Connection to icinga-master (172.24.141.153) 5665 port [tcp/*] succeeded!

Traffic Master → Agent

root@icinga-master:/etc/icinga2/zones.d/master# nc -vz icinga-setup 5665
^C
root@icinga-master:/etc/icinga2/zones.d/master#

Ping does not work, so I replaced it with a dummy check

template Host "generic-host" {
  max_check_attempts = 3
  check_interval = 1m
  retry_interval = 30s

  check_command = "dummy"
  vars.dummy_state = "0"
  vars.dummy_text = "This is a dummy"
}

After a few minutes, everything goes to “unknown” and “overdue”, as I would expect when the agent is not reachable.

So I think I’m missing something regarding how to configure the master not to run the checks itself but to wait for results, and how to configure the agent to send data instead of waiting for commands.

For the complete picture: I configured both using the node wizard as if they were in the same network and then shut down the agent’s firewall for incoming traffic.

I’m also interested in how this would work. Looking at this setup, I’m asking myself how the agent would know when to check itself, since it cannot receive any configuration such as check intervals from the master.
I also don’t know how the agent would even schedule its own checks.

Please share the agent’s host and service configuration.
Is it done via the Icinga Director or via the config files (by hand)?

Via the Director you should simply enable these settings:

“Icinga2 Agent = Yes” will create the necessary “Endpoint” and “Zone” for the agent, which, together with “Accepts config”, should switch the check execution to the agent IF the service template has the command_endpoint attribute set (Run on Agent = Yes in the Director)
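
In plain config terms, the “Run on Agent = Yes” part roughly corresponds to a service template like this (just a sketch; the template name is only an example):

template Service "agent-service-template" {
  // execute the check on the agent endpoint instead of the master
  command_endpoint = host_name
}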

What exactly do you need? This is a very basic setup. All agents are set up with accept_config and accept_commands enabled.

This was done by hand; I’ve never used the Director. What is the equivalent of “Establish connection” within the config files?

The Agent config on the master:

object Endpoint "icinga-setup" {
  host = "icinga-setup"
}

object Zone "icinga-setup" {
  endpoints = [ "icinga-setup" ]
  parent = "master"
}

object Host "icinga-setup" {
  import "generic-host"
  vars.agent_endpoint = name

  vars.os = "Linux"

  vars.disks["disk"] = {
    /* No parameters. */
  }
  vars.disks["disk /"] = {
    disk_partitions = "/"
  }

  vars.notification["mail"] = {
    groups = [ "icingaadmins" ]
  }
}

As already said, the setup is very basic, with accept_commands and accept_config enabled. The idea of a centralized configuration will most likely not work if the agent does not connect and ask for updates itself - yes or no?

The config on the agent needs the IP address of the master set.
Here is an example of a zones.conf file that connects to the master:

object Endpoint "LAPTOP-C40" {
}

object Endpoint "icinga-master" {
    host = "192.168.100.100";
}

object Zone "master" {
    endpoints = [ "icinga-master" ];
}

object Zone "LAPTOP-C40" {
    parent = "master";
    endpoints = [ "LAPTOP-C40" ];
}

object Zone "director-global" {
    global = true;
}

object Zone "global-templates" {
    global = true;
}

This is enough to tell your Icinga 2 agent that it has to connect to the master.

As @moreamazingnick said, setting the host attribute for the master in the agent’s zones.conf file is enough to tell the agent to connect to the master.
Therefore you don’t need the host attribute set in the master’s configuration for the agent.

Windows agent example from our setup (done via Director)

zones.d/private_vcloud/hosts.conf
object Host "ac01" {
    import "windows-agent-host-template"
    import "private_vcloud-host-template"

    address = "10.110.0.18"
}

zones.d/private_vcloud/agent_endpoints.conf
object Endpoint "ac01" {
    log_duration = 0s
}

zones.d/private_vcloud/agent_zones.conf
object Zone "ac01" {
    parent = "private_vcloud"
    endpoints = [ "aep-ac01" ]
}

To then have the check executed on the agent, the service template needs the command_endpoint attribute set. That’s why I wanted to see one of the service configs.

Another example:

template Service "windows-agent-powershell-service-status" {
    import "generic-service-template-15min"

    check_command = "windows_service-status"
    command_endpoint = host_name

}

And to have it all work, the API feature on the agent needs:

object ApiListener "api" {
  accept_config = true
  accept_commands = true
}
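
If the feature isn’t enabled yet, something like this on the agent should take care of it (assuming a systemd-based install):

# enable the API feature and restart so the listener comes up
icinga2 feature enable api
systemctl restart icinga2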

I just went onto the agent and set the master address in the zones.conf:

object Endpoint "icinga-master" {
  host = "icinga-master"
}

object Zone "master" {
  endpoints = [ "icinga-master" ]
}

object Endpoint "icinga-setup" {
}

object Zone "icinga-setup" {
  endpoints = [ "icinga-setup" ]
  parent = "master"
}

object Zone "global-templates" {
  global = true
}

object Zone "director-global" {
  global = true
}

AND IT WORKS! Thanks for giving me this additional hint.

And yes, the services are being told to run on the agent and not on the master. I don’t really see the benefit in running something on the master unless it is some kind of “check through SSH” scenario.

apply Service "apt" {
  import "generic-service"
  command_endpoint = host.vars.agent_endpoint
  check_command = "apt"
  vars.apt_only_critical = true
  assign where host.vars.os == "Linux"
}

I set the config and commands during the agent configuration:

object ApiListener "api" {
  accept_config = true
  accept_commands = true
}

icinga2 node setup --cn $ICINGA_AGENT --endpoint $ICINGA_MASTER --zone $ICINGA_AGENT --parent_zone master --accept-commands --accept-config --disable-confd
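
(The two shell variables are just the CNs of the two nodes; in this test setup they would presumably be set to something like the following.)

ICINGA_AGENT="icinga-setup"
ICINGA_MASTER="icinga-master"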

One last thing I can’t figure out: if the agent now runs “stand-alone” and cannot get config updates through the API, where is the agent’s config stored so I can modify it myself? The standard config in /etc/icinga2/conf.d has been excluded. Do I simply include it again, or do you suggest finding and updating the config created by the API?

What makes you think/say that? Without a connection to the master/parent the agent doesn’t do anything, because it has no config.

Synced configuration ends up in /var/lib/icinga2/api/.
It will get replaced/updated with every agent reload or config deployment.

For the case that the agent is not connected, you could have some form of check (executed by the parent, the master in your case) that monitors the connection status.
The cluster-zone check command would be suitable for that. Either configure it as an additional service check or even as the host check command (instead of ping/hostalive/…).
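
A minimal sketch of that as an additional service, assuming the agent zone is named after the host (as it is in your setup):

apply Service "agent-connectivity" {
  check_command = "cluster-zone"
  // no command_endpoint, so this check runs on the master;
  // cluster_zone defaults to the host name, which matches the agent zone here
  assign where host.vars.agent_endpoint
}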

Mostly lack of knowledge and experience. The sync did not work because of an incorrect zone configuration on the agent. Now, after I corrected the agent’s zones.conf, updates work as well. This means, despite my lack of experience, my problem has been solved.

I don’t see anything I could configure in the given api path, but as long as the config sync from the master works, everything is good.

This is the given api path, but nothing in there

drwxr-x--- 7 nagios nagios 4096 Sep 18 10:22 ./
drwxr-x--- 4 nagios nagios 4096 Sep 18 12:52 ../
drwxr-x--- 2 nagios nagios 4096 Sep 18 10:22 log/
drwx------ 3 nagios nagios 4096 Sep 10 08:30 packages/
drwxr-xr-x 2 root   root   4096 Apr 22 09:41 repository/
drwxr-x--- 2 nagios nagios 4096 Apr 22 09:41 zones/
drwx------ 2 nagios nagios 4096 Sep 18 10:28 zones-stage/

Apart from the folders that hold the configuration the agent got from the master 🙂
The whole process is explained in the docs: technical-concepts/#config-sync

Glad it’s working now!
