Inheritance and Dependency on my Infrastructure

Hello Community
I hope you can help me with my current Icinga challenge. I put together a visualization to
show the big picture as well as the challenge I face. Maybe one of you has an idea
how this could be solved.

I want to monitor the pictured infrastructure in Icinga, but I came across a
problem that I’m not sure how to solve. I hope that with the help of the community I
will be able to do so.
Maybe this helps others as well. :wink:
Test scenario: Let’s assume I have a classical DC HA setup:

  • Spine-Leaf network architecture
  • VMware cluster with many hosts
  • Shared and local storage
  • Single instance business services as one VM or container
  • Business services that can be balanced over many ESXi hosts as VMs
  • Business services that can be spread out and balanced over many containers
    on VMs, running on one or more ESXi hosts…

A normal setup for an enterprise.

A short explanation of the pictured infrastructure:
I simplified it a bit here. For this example I have two cases I would like
to tackle.
A few more details: the data-spines are a loose set of switches that act as
an HA group, so at least one switch needs to be up. Data-leafs are always combined
as a pair of switches to provide a redundant connection that supports e.g. LACP mode.
On the right is the legend, which explains the different objects and connections.
[Example 1]
The faulty situation, and what it means for monitoring, is shown on the right.
“Data-Leaf 3” has a problem (a defined number of its services are critical, so the
HOST is down). This has an impact on multiple other hosts:

  1. The “Data-leaf pair 3/4” should be in a WARNING state because redundancy is
    no longer guaranteed (not DOWN, because Data-leaf 4 still works completely fine).
    My problem here is that a HOST can’t be in a WARNING state, so I thought maybe
    it is possible to realize this with a SERVICE. But how can I query the states of
    other HOSTs or SERVICEs and use them in this SERVICE?
    I know I am able to query a host state, but apparently not a service state.
    Or did I miss something?
    [Example 2]
  2. Since the “Data-leaf pair 3/4” is in a WARNING state, it should pass
    its state (triangle) on to the dependent child HOSTs (XEN #1 Server, VM-B1, VM-B2,
    VM-Bn). That means every dependent HOST needs to be in the same state as its
    “parent” HOST: in this case the “Data-leaf pair 3/4” for the XEN server, and the XEN
    server for the VMs. My question for this example is: is it possible to add this
    kind of “state” to an object? Would it make sense to realize this with an
    additional SERVICE on each HOST, for example a service called “Host-state”
    that shows the state of the current host (OK, WARNING, CRITICAL, UNKNOWN)? Or
    is there another solution?
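
To make the two cases concrete, here is a minimal Python sketch of the state logic only. It is not Icinga configuration, and how the member states are fetched is left open. The names `State`, `pair_state` and `inherited_state` are hypothetical; the numbering follows the usual plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). Ranking CRITICAL above UNKNOWN when combining states is an assumption of this sketch.

```python
from enum import IntEnum


class State(IntEnum):
    """Service states, numbered like monitoring plugin exit codes."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def pair_state(members_up):
    """State of a redundant group, given one up/down flag per member."""
    if not members_up:
        return State.UNKNOWN       # nothing to evaluate
    up = sum(1 for m in members_up if m)
    if up == len(members_up):
        return State.OK            # full redundancy
    if up > 0:
        return State.WARNING       # still forwarding, redundancy lost
    return State.CRITICAL          # every member is down


def inherited_state(own, parent):
    """'Host-state' of a child: the worse of its own and its parent's state."""
    # Assumption: CRITICAL outranks UNKNOWN so a hard failure is never masked.
    severity = {State.OK: 0, State.WARNING: 1, State.UNKNOWN: 2, State.CRITICAL: 3}
    return max(own, parent, key=severity.get)


# Hypothetical data for the faulty scenario: Data-Leaf 3 down, Data-Leaf 4 up.
leaf_pair = pair_state([False, True])
xen1 = inherited_state(State.OK, leaf_pair)
vm_b1 = inherited_state(State.OK, xen1)
print(leaf_pair.name, xen1.name, vm_b1.name)  # WARNING WARNING WARNING
```

A “Host-state” service on each host could report the result of `inherited_state`, so the host object itself stays UP while the degraded redundancy remains visible further down the chain.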

I would appreciate any hints or ideas from the community on how to realize this,
perhaps in a better or different way.

Thank you in advance. I’m looking forward to working on this problem, maybe with
someone who has already tried something similar.

Hello

I have encountered your dilemma at another installation, where the “hostalive” base check was inappropriate for the testing and business review they needed.

The solution that was implemented was as follows:
Hosts were mostly a “null” container, meaning each one held the name, IP and vars but was not checked as a “host” object; we used the dummy check with an ‘always up’ result.
To ensure proper testing and network awareness, each device was pinged by a service. Dependencies on that ping service were propagated to the rest of the services on the same host, and other hosts in turn depended on these results. So we no longer had the “parent - child” connection, but we had the granularity of dependencies together with inheritance and the flexibility of apply rules.
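
The idea of chaining everything to ping services can be illustrated with a small Python sketch. It only models the reachability logic, not Icinga's own `Dependency` objects. The object names and the `is_reachable` helper are hypothetical, and treating OK and WARNING as "fine" parent states is an assumption of this sketch.

```python
# Parent states under which a dependency counts as satisfied (assumption).
FINE = {"OK", "WARNING"}

# Hypothetical objects, keyed by (host, service).
states = {
    ("switch-1", "ping"): "CRITICAL",
    ("server-1", "ping"): "OK",      # stale result from before the outage
    ("server-1", "disk"): "OK",
}

# child -> parents it depends on
depends_on = {
    ("server-1", "disk"): [("server-1", "ping")],   # same-host services hang off ping
    ("server-1", "ping"): [("switch-1", "ping")],   # host ping hangs off upstream ping
}


def is_reachable(obj, states, depends_on, _seen=None):
    """True if every parent, followed recursively, is in a fine state."""
    seen = _seen or set()
    if obj in seen:                  # guard against dependency loops
        return True
    seen = seen | {obj}
    for parent in depends_on.get(obj, []):
        if states.get(parent) not in FINE:
            return False
        if not is_reachable(parent, states, depends_on, seen):
            return False
    return True


print(is_reachable(("server-1", "disk"), states, depends_on))  # False
print(is_reachable(("switch-1", "ping"), states, depends_on))  # True
```

Here the disk service is unreachable because the upstream switch's ping is CRITICAL, even though no host-level parent/child relation is defined anywhere.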

Regards


Thank you for your hint, it seems like a good starting point. How did you implement the dependencies on the ping service? So far I couldn’t find out how I can use the result of one service in another service.

regards

Check the documentation on dependencies:
https://icinga.com/docs/icinga2/latest/doc/03-monitoring-basics/#dependencies