Hi,
what @twidhalm explains about automation with Chef and the Icinga agent is entirely valid, but still overwhelming, imho. At first glance, you need to understand the differences between the legacy (Nagios) format and the current Icinga DSL, especially in how command arguments are passed.
There are a number of predefined CheckCommand definitions available in the ITL (Icinga Template Library), so you don’t need to spend extra time defining things others have defined many times already. One of them is nrpe; others include disk, load and community-contributed definitions for popular check plugins such as nwc_health or mysql_health.
I would assume that your current setup still runs while you’re migrating to Icinga, and as such, the NRPE daemon on the agent is still running.
For a first success with Icinga, I’d just implement the service the Icinga way. After that, you can consider modern and secure agent communication with Icinga (TLS, endpoints, a trust relationship that prevents MITM attacks).
Analysis and Migration
As you already know, NRPE is in place. Looking at your CheckCommand
command_line $USER1$/check_nrpe -n -H $HOSTNAME$ -t 60 -c $ARG1$ -a '$ARG2$'
there are some things to consider:
- -n disables SSL/TLS, which is bad practice. For migration purposes, the nrpe command needs the nrpe_no_ssl attribute set to true.
- -t 60 sets a hard timeout of 60 seconds; this needs nrpe_timeout set to 60.
- -c translates into nrpe_command, which is passed as $ARG1$ on the command line.
- -a adds additional parameters on the command line (which must be enabled on the agent host).
In the old world, the service itself supplies the values for the two $ARGn$ macros:
check_command check-ips!common_check_ip!-i 205.189.10.21,205.189.10.22
The above is typically bad practice, as it allows the caller to pass arbitrary arguments with -a, which the remote end then inserts into the command it executes. A compromised master may abuse this to read secrets from the attacked host.
Anyhow, we know that nrpe_arguments in Icinga is passed as the -a arguments, and multiple arguments therefore need to be defined as an array:
vars.nrpe_arguments = [ "-i", "205.189.10.21,205.189.10.22" ]
In addition, the NRPE command called on the remote end is just a string:
vars.nrpe_command = "common_check_ip"
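To tie the flag analysis together, here is a minimal sketch of how the two connection-related flags could live in a reusable service template. The template name legacy-nrpe is made up for illustration; nrpe_no_ssl and nrpe_timeout are the custom variables read by the ITL nrpe CheckCommand.

```
// Hypothetical template name; collects the settings derived from -n and -t 60.
template Service "legacy-nrpe" {
  check_command = "nrpe"

  vars.nrpe_no_ssl = true   // legacy -n: no SSL/TLS, intended for the migration phase only
  vars.nrpe_timeout = 60    // legacy -t 60: timeout in seconds
}
```

A service would then import "legacy-nrpe" and only add its own nrpe_command and nrpe_arguments.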
Configuration
Do this outside of the Director at first to get a feel for it, then move the bits and attributes you have learned about into the Director. Later on, use the Director to manage an Icinga agent only. Follow the best practices for hosts right from the start; that makes a later transition to an agent easy.
The host object just needs the typical configuration settings, and you can use the template logic to e.g. store additional facts about the host itself.
template Host "linux-server-ubuntu" {
  vars.os_type = "Linux"
  vars.os_distribution = "Ubuntu"
}

object Host "linux-host-01.fqdn.com" {
  import "linux-server-ubuntu"

  address = "..."
  vars.os_distribution_release = "18.04 LTS"
  vars.extra_ip_checks = true
}
There’s one extra thing above: extra_ip_checks, which serves as a filter for the apply rule below.
Now, define a service apply rule which implements the previous check-ips definition.
apply Service "IP Checks" {
  check_command = "nrpe"

  vars.nrpe_command = "common_check_ip"
  vars.nrpe_arguments = [ "-i", "205.189.10.21,205.189.10.22" ]

  // define your own rule/strategy where to generate a service object here
  assign where host.vars.os_type == "Linux" && host.vars.extra_ip_checks == true
}
In the first iteration, this configuration should work and present something on the web interface. It is far from perfect though, since the service apply rule has the IP addresses hardcoded. One could move this line into the host object.
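As a rough sketch of that idea, the addresses could be stored in a host custom variable and joined in the apply rule. The variable name check_ip_addresses is an assumption made up for this example; the two objects are revised versions of the ones above and replace them rather than being added next to them.

```
object Host "linux-host-01.fqdn.com" {
  import "linux-server-ubuntu"

  address = "..."
  vars.os_distribution_release = "18.04 LTS"
  vars.extra_ip_checks = true

  // Hypothetical custom variable holding the addresses to check
  vars.check_ip_addresses = [ "205.189.10.21", "205.189.10.22" ]
}

apply Service "IP Checks" {
  check_command = "nrpe"

  vars.nrpe_command = "common_check_ip"
  // Build the comma-separated list from the host's array when the rule is evaluated
  vars.nrpe_arguments = [ "-i", host.vars.check_ip_addresses.join(",") ]

  assign where host.vars.os_type == "Linux" && host.vars.extra_ip_checks == true
}
```

Every host matched by the assign rule then needs check_ip_addresses set, otherwise the join call fails during config validation.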
Still, requiring the admin to remember to write -i in front of the argument is error-prone. You should hide that inside the CheckCommand definition in the future.
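One possible shape for such a CheckCommand is sketched below. It is untested and makes several assumptions: the names nrpe-check-ip and ip_check_targets are invented for this example, and it relies on Icinga resolving the runtime macro inside the array element. Validate it with icinga2 daemon -C and a test run before relying on it.

```
// Hypothetical wrapper around the ITL "nrpe" CheckCommand
object CheckCommand "nrpe-check-ip" {
  import "nrpe"

  vars.nrpe_command = "common_check_ip"
  // -i is fixed here, the admin only provides the address list
  vars.nrpe_arguments = [ "-i", "$ip_check_targets$" ]
}

apply Service "IP Checks" {
  check_command = "nrpe-check-ip"

  // Hypothetical custom variable, resolved by the macro above at check time
  vars.ip_check_targets = "205.189.10.21,205.189.10.22"

  assign where host.vars.os_type == "Linux" && host.vars.extra_ip_checks == true
}
```

With this layout the service only carries the data, and the knowledge about the -i flag lives in one place.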
Icinga Agent + Director
This requires setting up the master first and then the agent with Icinga, following the docs, e.g. for a Linux host.
To help you further, we need the definition of common_check_ip from your nrpe.cfg in order to build and use a proper CheckCommand definition.
Cheers,
Michael