Need help understanding Icinga Master => Satellite => Agent

Hi,

I am really having a lot of trouble understanding how to configure this…

I want to have following:

  • Master (testmaster.mydomain.tld)
  • Agent “child of master” (testagentdebian.mydomain.tld)
  • Satellite (testsatellite.mydomain.tld)
  • Agent “child of satellite” (testagentcentos.mydomain.tld)

So I configured these endpoints:

object Endpoint "testmaster.mydomain.tld" {
  host = "testmaster.mydomain.tld"
  port = 5665
}

object Endpoint "testsatellite.mydomain.tld" {
  host = "testsatellite.mydomain.tld"
  port = 5665
}

object Endpoint "testagentdebian.mydomain.tld" {
  host = "testagentdebian.mydomain.tld"
  port = 5665
}

object Endpoint "testagentcentos.mydomain.tld" {
  host = "testagentcentos.mydomain.tld"
  port = 5665
}

and these zones:

object Zone "testmaster.mydomain.tld" {
  endpoints = [ "testmaster.mydomain.tld", "testagentdebian.mydomain.tld" ]
}

object Zone "testsatellite.mydomain.tld" {
  endpoints = [ "testsatellite.mydomain.tld", "testagentcentos.mydomain.tld" ]
  parent = "testmaster.mydomain.tld"
}

On my master I have a zones.d directory with three subdirectories: global, testmaster.mydomain.tld and testsatellite.mydomain.tld.
In testmaster.mydomain.tld I defined the hosts and services for testmaster and testagentdebian, and in testsatellite.mydomain.tld I defined the hosts and services for testsatellite and testagentcentos.
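So the directory layout on the master looks roughly like this (just a sketch; how the .conf files are split inside each directory is my own choice):

/etc/icinga2/zones.d/
  global/
  testmaster.mydomain.tld/
  testsatellite.mydomain.tld/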

Unfortunately I can’t check testagentcentos…
In Icingaweb2 I always get:

Remote Icinga instance 'testagentcentos.mydomain.tld' is not connected to 'testsatellite.mydomain.tld'

and I really don’t understand how to configure my system…

Thanks a lot for your explanation!
Luca

You should define one zone for each of your machines, and that machine should
be the only one listed in its zone.

testagentdebian should have testmaster as its parent.

testsatellite should have testmaster as its parent.

testagentcentos should have testsatellite as its parent.

I hope that helps.

Antony.

Hi Antony,

thank you for your answer…
I really can't see the point of a zone that contains only the host itself, but I tried it.
My new zones.conf:

object Zone "testmaster.mydomain.tld" {
  endpoints = [ "testmaster.mydomain.tld" ]
}

object Zone "testagentdebian.mydomain.tld" {
  endpoints = [ "testagentdebian.mydomain.tld" ]
  parent = "testmaster.mydomain.tld"
}

object Zone "testsatellite.mydomain.tld" {
  endpoints = [ "testsatellite.mydomain.tld" ]
  parent = "testmaster.mydomain.tld"
}

object Zone "testagentcentos.mydomain.tld" {
  endpoints = [ "testagentcentos.mydomain.tld" ]
  parent = "testsatellite.mydomain.tld"
}

… but it does not help…
Same error…

Any other idea?

Thanks
Luca

Hi Antony,

thank you for your answer…
I really can't see the point of a zone that contains only the host itself

I agree with you. I also think the name “zone” is highly misleading, because
it tends to make people think of groups of machines, or geographical regions
etc, when that’s really not what an Icinga zone is :frowning:

Same error…

What do you see in /var/log/icinga2/icinga2.log on the machines which are not
communicating?

Antony.

Hi Antony,

So, I got it…
I need to copy zones.conf to every host and build the parent/child relationships there, so every host has a quite similar (but not identical) zones.conf.
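For example, the zones.conf on the satellite now looks roughly like this (sketched from memory, so take it as an approximation; it only needs to know itself, its parent and its agent):

// sketch: zones.conf on testsatellite.mydomain.tld
object Endpoint "testmaster.mydomain.tld" {
  host = "testmaster.mydomain.tld"
  port = 5665
}

object Endpoint "testsatellite.mydomain.tld" {
}

object Endpoint "testagentcentos.mydomain.tld" {
  host = "testagentcentos.mydomain.tld"
  port = 5665
}

object Zone "testmaster.mydomain.tld" {
  endpoints = [ "testmaster.mydomain.tld" ]
}

object Zone "testsatellite.mydomain.tld" {
  endpoints = [ "testsatellite.mydomain.tld" ]
  parent = "testmaster.mydomain.tld"
}

object Zone "testagentcentos.mydomain.tld" {
  endpoints = [ "testagentcentos.mydomain.tld" ]
  parent = "testsatellite.mydomain.tld"
}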

I really don't understand why that's necessary… To me, the Endpoint definition should be enough if a zone only has one Endpoint…

Well, at least I solved my problem!
But now another question: do I understand correctly that I can distribute the configuration?
Then why can't I distribute zones.conf, too? Is there any way to do that?

Thanks
Luca

Hi,

let’s try to explain this with examples:

Think about a big IT infrastructure. You have a lot of network devices, servers and storage systems, and for load balancing and protection against failure you install many Icinga nodes that check your infrastructure. If you had only one node and it were down for maintenance/…, you would be blind.

object Endpoint "node1" {
  host = "icinga_node_1"
  port = 5665
}

object Endpoint "node2" {
  host = "icinga_node_2"
  port = 5665
}

object Endpoint "nodeZ" {
  host = "icinga_node_z"
  port = 5665
}

So your Icinga config could look like this:

object Zone "master" {
endpoints = ["node1", "node2"]
}

object Zone "network-devices" {
endpoints = ["node3", "node4"]
parent "master"
}

object Zone "network-server_standard" {
endpoints = ["node5", "node6"]
parent "master"
}

object Zone "network-server_dmz" {
endpoints = ["node7", "node8"]
parent "master"
}

object Zone "network-storage" {
endpoints = ["node9", "node10"]
parent "master"
}

Another possibility is locality: your company has many locations on all continents, and at some of them the internet connection is not very good. So your config could look like this:

object Zone "europe" {
endpoints = ["node3", "node4"]
parent "master"
}

object Zone "north_america" {
endpoints = ["node5", "node6"]
parent "master"
}

object Zone "south_america" {
endpoints = ["node6", "node7"]
parent "master"
}

So it may be a good idea to name a zone after what it is checking. Only the zone of an agent is named after its hostname.

I hope this makes it clearer.

I understand what you’re saying, but putting two endpoints into one zone does
not do what most people would expect, and putting three or more endpoints in a
zone is not allowed.


The documentation on #endpoints says:

“All endpoints in the same zone work as high-availability setup. For example,
if you have two nodes in the master zone, they will load-balance the check
execution.”

Therefore two endpoints in one zone are not separate machines which are being
monitored by Icinga - they are load-balanced Icinga servers, sharing the
service checks of other machines between them.

“There is a known problem with >2 endpoints in a zone and a message routing
loop. The config validation will log a warning to let you know about this too.”

So, you can’t put more than 2 endpoints in one zone.

Most people would think of a zone named “network-devices” as being something
that contains all the switches and routers being monitored on their network,
but it isn't. It's something that contains one endpoint, or possibly two if
you're doing HA, and you need a separate zone for every switch or router you
want to monitor.

Every endpoint needs to go into its own zone - there's no concept of grouping
similar devices (in terms of functionality or location) into zones, with 6
things in one zone and 10 things in another, etc.

Antony.

Hi Antony,

thank you very much for your explanation!
So, my next question: is there a module for Icingaweb2 that gives an overview of all hosts with their dependencies?
Somewhere I can see that I have testmaster with testagentdebian under it, then testsatellite with testagentcentos under it, and so on…

I think, it will help us a lot…

Thanks
Luca

Two satellite servers in one zone are two different servers, and of course they should also be checked. There are cluster checks, and don't forget the standard checks like disk, load, etc.
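For example, connectivity towards a child zone can be checked from the parent with the built-in cluster-zone check; a rough sketch (the host and zone names here are just placeholders):

object Service "cluster-zone-network-devices" {
  // runs on the master node and goes CRITICAL if the child zone is not connected
  host_name = "node1"
  check_command = "cluster-zone"
  vars.cluster_zone = "network-devices"
}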

The reason is that more than two nodes in the same zone would sync themselves to death, because they are constantly exchanging data.

First of all, it's your own decision how you assign the devices you want to check to a zone. It could be by device type, geographical location, network topology, security or whatever. So it's the admin's choice.
I only gave two examples of how it could look, because the question was:

This is something you should think about before building an Icinga HA cluster. But maybe it's also enough to have only one master and perhaps an additional satellite; that depends on how big your infrastructure is. It makes sense to give the different zones meaningful names.

If you have installed the Director you can assign your host objects to a zone in an easy way. And if you have two satellites in a zone, both agree on who checks what. This is done with a modulo procedure (load balancing). If one node is down, the other checks everything until the first node is back again (failover).

If you put every satellite into its own zone you have neither load balancing nor failover. But if it's an agent configuration, you're right.
And of course it doesn't make much sense to create an HA cluster if you only have, say, 10 devices to check; then satellite A checks 5 and satellite B checks the other 5 devices. We, for example, have more than 3000 devices to check. The situation looks a little different then.
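To make the grouping concrete, here is a rough sketch (the path and names are made up): a host config placed on the master under zones.d/ for a two-satellite zone is synced to both satellites, and they share its checks between them:

// /etc/icinga2/zones.d/network-devices/hosts.conf  (made-up example)
object Host "switch01.example.com" {
  check_command = "hostalive"
  address = "192.0.2.10"
}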

In summary, and regarding "the concept of grouping", we can say:
it depends on what your infrastructure looks like. And the good thing about Icinga: it is very flexible for this kind of thing.

I found this module https://github.com/visgence/icinga2-dependency-module a while ago, but I never got it to work. Maybe other community members did.

Hi Stevie

I found this module https://github.com/visgence/icinga2-dependency-module a while ago, but I never got it to work. Maybe other community members did.

Yes, I found it, too… and I couldn't get it to work either… :frowning:

Thanks
Luca