Our customer TAC does not act on any unknown (or warning) for ALL customers, so our approach is a little different.
In your description, it looks like you’re on the right path, but I would ensure you have all of the following:
- Service Group via apply rule that is for disabled unknowns
- Host Group via apply rule that is for disabled unknowns
- Notification Apply rule that disables Unknown alerts for hosts/services (you may use a notification template as you have above to import for this rule)
Unfortunately, we use the Director, since we automate the import, group assignment, and service assignment for ~25k services (never mind the host count), so I don’t have a good .conf file example of doing the assignments.
Basically, you want to first create a notification template that does not include the Unknown state. From there, create both a service group and a host group, and assign the hosts/services to those groups with the method of your choosing. Custom variables are the way to go; manual assignment works if your environment is tiny, but for anything larger it’s time to use custom variables.
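Since I can’t paste our Director output, here is a rough hand-written sketch of what the template and the two groups could look like in plain .conf form. It is untested against your setup. The template name `notify-no-unknown`, the custom variable `disable_unknown`, the host address, and the `mail-service-notification` command are all assumptions for illustration; only the group names come from this thread.

```icinga2
// Hypothetical template: notifies on every service state except Unknown.
// "mail-service-notification" is assumed to be an existing NotificationCommand.
template Notification "notify-no-unknown" {
  command = "mail-service-notification"
  states = [ OK, Warning, Critical ]
  types = [ Custom, Problem, Recovery ]
}

// Groups are filled from a custom variable (assumed name: disable_unknown)
// rather than from a list of hostnames.
object HostGroup "linux_host_disable_unknown" {
  display_name = "Linux hosts with Unknown notifications disabled"
  assign where host.vars.disable_unknown == true
}

object ServiceGroup "linux_services_disable_unknown" {
  display_name = "Linux services with Unknown notifications disabled"
  assign where service.vars.disable_unknown == true
}

// Example host that opts in; the address is a placeholder.
object Host "host1" {
  check_command = "hostalive"
  address = "192.0.2.10"
  vars.disable_unknown = true
}
```

Setting the variable on a host or service template instead of on individual objects should get a whole class of machines into the group in one place.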
After this, create your notification apply rule for excluding unknowns, import the template to the rule, and assign where the group name is linux_host_disable_unknown and linux_services_disable_unknown. You will have a second notification apply rule that applies to either:
- hosts/services NOT in the disable group
- hosts/services that are in an ENABLE group
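As a rough sketch of the two service rules, assuming a template named `notify-no-unknown` (hypothetical) that leaves out Unknown and an existing `mail-service-notification` command (also an assumption), this uses the "NOT in the disable group" variant for the second rule:

```icinga2
// Rule 1: services in the disable group get the template without Unknown.
apply Notification "service-notify-no-unknown" to Service {
  import "notify-no-unknown"        // hypothetical template name
  users = [ "cust.linux.support" ]
  assign where "linux_services_disable_unknown" in service.groups
}

// Rule 2: every other service is notified on all states, Unknown included.
apply Notification "service-notify-all-states" to Service {
  command = "mail-service-notification"   // assumed NotificationCommand
  states = [ OK, Warning, Critical, Unknown ]
  types = [ Custom, Problem, Recovery ]
  users = [ "cust.linux.support" ]
  assign where !("linux_services_disable_unknown" in service.groups)
}
```

The host side would be the same pair of rules with `to Host`, `host.groups`, and `linux_host_disable_unknown`. Running `icinga2 daemon -C` before reloading will validate the config, and it is worth checking in the web interface that each service ends up with exactly one of the two notifications.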
Just looking at your config, it looks like you’re most of the way there, or possibly all of the way there. I would just do a sanity check on everything.
assign where host.name == "host1" || match("host2*", host.name) || host.name == "host3" || host.name == "host4" || host.name == "host5"
states = [ Down, Up ]
types = [ Custom, Problem, Recovery ]
users = [ "cust.linux.support" ]
Using the service/host groups means you don’t have to filter on the hostname. You can choose your own path to get them into the group (i.e., using a custom variable, or just manually entering the names into the host/service groups).
After all of this is sorted out, you can take it a step further in Icingaweb by introducing a custom menu option in /etc/icingaweb2/navigation/menu.ini that filters out all of these as well.
You can do something that’s like “show all unhandled customer problems” that filters out unknowns for the service group, but keeps the unknowns for the others. Here is a slightly modified example that we use live in our environment:
# THERE COULD BE TYPOS -- this is meant for you to modify to your needs
# First we have to create the top level menu option
[CTAC]
name = "CTAC"
users = "yes" # any icingaweb users
groups = "also_yes" # any icingaweb groups, note: not notification groups
type = "menu-item"
icon = "error.png"
priority = "1" # This places the menu option at the very top of the right-hand pane
# Next we can add in the link that filters out what we don't want
# This particular one is only for services, but can be modified to include hosts, or you can use another menu option for hosts
[CTAC-Unhandled Services Not Unknown]
name = "Unhandled Critical Services"
users = "yes" # any icingaweb users
groups = "also_yes" # any icingaweb groups, note: not notification groups
type = "menu-item"
target = "_main" # this opens the link up in the same tab
url = "monitoring/list/services?service_in_downtime=False&service_acknowledged=0&(service_state=2)&modifyFilter=0&limit=100&sort=host_display_name&service_hard_state=2&((hostgroup=Linux_host_disable_unknown)|(hostgroup=Linux_host_enable_unknown))"
owner = "root"
Xicon = "error.png"
priority = "1" # First option
parent = "CTAC" # Assigns this sub option to the CTAC option defined above
# Last, we can add in the link that filters out what we don't want. Unfortunately, I don't know a good way offhand, but you can probably do something along the lines of "include hosts/services in the enable group | services that are in critical/warning"
# This particular one is only for services, but can be modified to include hosts, or you can use another menu option for hosts
[CTAC-Unhandled Services]
name = "Unhandled Critical Services"
users = "yes" # any icingaweb users
groups = "also_yes" # any icingaweb groups, note: not notification groups
type = "menu-item"
target = "_main" # this opens the link up in the same tab
url = "monitoring/list/services?service_in_downtime=False&service_acknowledged=0&(service_state=2|service_state=3)&modifyFilter=0&limit=100&sort=host_display_name&service_hard_state=2&((hostgroup=Linux_host_disable_unknown)|(hostgroup=Linux_host_enable_unknown))"
owner = "root"
Xicon = "error.png"
priority = "2" # Second option
parent = "CTAC" # Assigns this sub option to the CTAC option defined above