Problem migrating a check from Nagios to Icinga2: passing $ARG1$

Hello,

we are working on monitoring a Fujitsu Eternus storage with Icinga2. Fujitsu offers a compiled Python plugin for Nagios, together with documentation for Nagios.

The command definition for Nagios looks like this:

command_line /usr/bin/python $USER1$/check_fujitsu_eternus_dx.pyc
 --host=$HOSTADDRESS$ --user=$_HOSTETERNUS_USER$
 --verbose=$_HOSTETERNUS_OPTIONS$  $ARG1$

For Icinga2 we have defined the following check:

object CheckCommand "eternus"  {
    command = [
            Python + PluginDir + "/check_fujitsu_eternus_dx.pyc "
            ]
    arguments = {
            "--host" = {
                    value = "$hostaddress$"
                    }
            "--user" = {
                    value = "$user$"
                    }
            "--verbose" = {
                    value = "$options$"
                    description = "1 - general mode, 2 - detail mode"
                    }
            }
    vars.args = ""

}

object Host "DX100-25" {
    address = "10.74.194.25"
    check_command = "hostalive"
    vars.notification["mail"] = {
            groups = ["icingaadmins"]
    }
}

object Service "System"{
    host_name = "DX100-25"
    check_command = "eternus"
    vars.user="icinga1"
    vars.options="2"
    vars.args="--chkraids"
}

But the check fails:

[2019-07-08 14:20:48 +0000] warning/PluginCheckTask: Check command for object 'DX100-25!System' (PID: 11988, arguments: '/usr/bin/python2.7 /usr/lib/nagios/plugins/check_fujitsu_eternus_dx.pyc ' '--user' 'icinga1' '--verbose' '2') terminated with exit code 128, output: execvpe(/usr/bin/python2.7 /usr/lib/nagios/plugins/check_fujitsu_eternus_dx.pyc ) failed: No such file or directory

When I execute the check on the command line, it finishes as expected:

/usr/bin/python2.7 /usr/lib/nagios/plugins/check_fujitsu_eternus_dx.pyc --host=10.74.194.25 --user=icinga1 --verbose=2 --chkraids
RAIDS OK - 8 raid-groups found, freespace 0(MB), totalspace 71671656(MB)

The problem is that I have no idea how to define the check command so that it correctly passes the value for --chkraids.

Thanks for any hints or suggestions,

Stefan

Hi,

I think that the --chkraids option should be defined inside the CheckCommand object as an argument, after or even before --verbose. Then enable the variable either in the service object or in the host object.

Regards,
George

Hi,

I guess I have spent too much time on this. The problem is that the plugin requires the “=” sign between key and value, not a space:

--host=10.74.194.25

But

arguments = {
                "--host=" = {
                        value = "$hostaddress$"
                        }

doesn’t seem to be the solution:

/usr/bin/python2.7 /usr/lib/nagios/plugins/check_fujitsu_eternus_dx.pyc ' '--host='  '10.74.194.25' ....

Regards,

Stefan

Shouldn’t that be $host.address$ ?

Hi,

even if I remove every argument and define the command like this:

object CheckCommand "eternus-raid"  {
    command = [
            Python + PluginDir + "/check_fujitsu_eternus_dx.pyc --host=10.74.194.25 --user=icinga1 --verbose=1 --chkraids"
            ]
 }

debug.log tells me

[2019-07-08 17:09:56 +0000] warning/PluginCheckTask: Check command for object 'DX100-25!System' (PID: 29233, arguments: '/usr/bin/python2.7 /usr/lib/nagios/plugins/check_fujitsu_eternus_dx.pyc --host=10.74.194.25 --user=icinga1 --verbose=1 --chkraids') terminated with exit code 128, output: execvpe(/usr/bin/python2.7 /usr/lib/nagios/plugins/check_fujitsu_eternus_dx.pyc --host=10.74.194.25 --user=icinga1 --verbose=1 --chkraids) failed: No such file or directory

Regards,

Stefan

Last question before bedtime and then we continue tomorrow:

What is that Python + PluginDir ?

That just glues together python and the plugin directory. No wonder it says command not found for ‘/usr/bin/python2.7/usr/lib/nagios/etc’

Hi,

here is my /etc/icinga2/constants.conf

.....
const PluginDir = "/usr/lib/nagios/plugins"
const Python = "/usr/bin/python2.7 "
.....

There is a space after "2.7 " - I fixed that mistake already hours ago :wink:

Thank you for your help!

Regards,

Stefan

Hi,

the problem with the command path is that it contains multiple elements. You cannot simply add a space in the middle of a string: Icinga takes care of shell escaping, so /usr/bin/python2.7 /usr/lib/nagios/plugins/check_... is treated as one whole string, resulting in '/usr/bin/python2.7 /usr/lib/nagios/plugins/check_...'.

execvpe() takes that string as its first argument, the program to execute, and the lookup fails because no file with that name exists. That’s what the error says. If you provide two separate array items instead, it will work again.
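The same failure can be reproduced outside Icinga. This is only a minimal sketch, assuming a POSIX system and using the currently running Python interpreter as a stand-in for /usr/bin/python2.7; the helper name run_quietly is made up for illustration.

```python
import subprocess
import sys


def run_quietly(argv):
    """Run argv without a shell and return its exit code.

    Returns None when the program named by argv[0] cannot be found,
    which is the situation behind "No such file or directory".
    """
    try:
        return subprocess.run(argv, check=False).returncode
    except FileNotFoundError:
        return None


# One array element: the whole string, space included, is used as the program path.
glued = run_quietly([sys.executable + " -c pass"])

# Separate elements: the interpreter is argv[0], the rest are its arguments.
split = run_quietly([sys.executable, "-c", "pass"])

print(glued, split)
```

The first call fails the lookup in the same way as the glued Python + PluginDir string, while the second one starts the interpreter normally.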

Base Command

I would also recommend renaming Python to PythonBin to give it a more descriptive name.

const PythonBin = "/usr/bin/python2.7"

Note that the array elements are separated by a comma.

  command = [ PythonBin, PluginDir + "/check_fujitsu_eternus_dx.pyc"]

Arguments

The command arguments need to be put into the arguments attribute, just like you had it before. I would recommend following the guidelines for integrating a new CheckCommand object though. This includes using the command’s name as parameter prefix.

Also, extract the parameter description from the plugin, and add it next to value. That makes it easier to identify the parameter later on.

    arguments = {
            "--host" = {
                    value = "$fujitsu_eternus_dx_address$"
            }
            "--user" = {
                    value = "$fujitsu_eternus_dx_user$"
            }
            "--verbose" = {
                    value = "$fujitsu_eternus_dx_options$"
                    description = "1 - general mode, 2 - detail mode"
            }
     }

Btw - I highly recommend closing the brackets at the same indentation level as the attribute they belong to. Nagios used a different style guide with an extra tab indent there, which is sort of creepy and hard to read.

ARG1?

The remaining question is what you would pass into ARG1 from the old monitoring world. Is that a free-form list of parameters? Which additional ones are possible for the plugin?

Found it, next time please add a URL to it.
https://www.fujitsu.com/global/support/products/computing/storage/download/nagios-plugin.html

The manual also says that --user2 is available, add that to the generic CheckCommand object.

            "--user2" = {
                    value = "$fujitsu_eternus_dx_user2$"
            }

Looking at page 25, the plugin developers added quite a few more parameters to the plugin, but were too lazy to write command definitions and docs explaining each parameter again. I would’ve expected this in the base table for the command parameters, but anyways.

Extra Arguments

Instead of putting in a list of extra arguments, I would modify that into optional arguments you can define.

The list in the PDF has:

  • --warning
  • --critical
  • Various check types prefixed with --chk
  • --uom
  • --performance
  • --remoteboxid

The different check types are sort of a problem, since they are hardcoded into the parameter key, not sent as extra value.

I see two options:

  • Specify the type as String in the service and automatically “generate” the parameter based on this.
  • Add boolean parameters with set_if to only set them on-demand when specified in the service

For simplicity I would recommend the second option.

object CheckCommand "fujitsu_eternus_dx" {
    command = [ PythonBin, PluginDir + "/check_fujitsu_eternus_dx.pyc"]

    arguments = {
            "--host" = {
                    value = "$fujitsu_eternus_dx_address$"
            }
            "--user" = {
                    value = "$fujitsu_eternus_dx_user$"
            }
            "--user2" = {
                    value = "$fujitsu_eternus_dx_user2$"
            }
            "--verbose" = {
                    value = "$fujitsu_eternus_dx_options$"
                    description = "1 - general mode, 2 - detail mode"
            }
            "--warning" = {
                    value = "$fujitsu_eternus_dx_warning$"
                    description = "Warning threshold"
            }
            "--critical" = {
                    value = "$fujitsu_eternus_dx_critical$"
                    description = "Critical threshold"
            }
            "--performance" = {
                    value = "$fujitsu_eternus_dx_performance$"
                    description = "Performance mode, available options: P" //TODO: Add more options
            }
            "--uom" = {
                    value = "$fujitsu_eternus_dx_uom$"
                    description = "Set the required unit of measurement for performance data metrics."
            }

            //Add all available modes and set the parameters whenever enabled in the service.
            "--chkvolumes" = {
                    set_if = "$fujitsu_eternus_dx_chkvolumes$"
                    description = ""
            }
            "--chkraids" = {
                    set_if = "$fujitsu_eternus_dx_chkraids$"
                    description = ""
            }
            "--chkrecpaths" = {
                    set_if = "$fujitsu_eternus_dx_chkrecpaths$"
                    description = ""
            }

           //TODO for you: Add more chk parameters based on the docs. 


            // Special handling for remoteboxid - only add the parameter with the value when the specific command type is used
            "--remoteboxid" = {
                    value = "$fujitsu_eternus_dx_remoteboxid$"
                    description = ""
                    set_if = "$fujitsu_eternus_dx_chkrecpaths$"
            }
     }
}

A typical Service definition is rather short and readable, without those parameter hacks after the ! character. Errors in check execution typically stem from free-form arguments, which is why Icinga 2 follows a different approach with generic check command objects.

apply Service "eternus-raids" {
  check_command = "fujitsu_eternus_dx"

  vars.fujitsu_eternus_dx_chkraids = true
  vars.fujitsu_eternus_dx_warning = "N1"
  vars.fujitsu_eternus_dx_critical = "M1"
  vars.fujitsu_eternus_dx_performance = "P"

  assign where host.vars.vendor == "fujitsu" && host.vars.hw_type == "eternus"
}
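To see which parameters such a service would end up passing, here is a rough Python model of the argument handling. It is only a sketch under assumptions, not Icinga's actual implementation: it assumes that a key with set_if is emitted only when the macro resolves to a true value, that a key/value pair is dropped when its macro is not set, and that keys are emitted in alphabetical order. The names resolve and build_argv are made up for illustration.

```python
import re

MACRO = re.compile(r"\$([A-Za-z0-9_.]+)\$")


def resolve(template, variables):
    """Replace $name$ macros; return None if any macro is undefined."""
    missing = False

    def substitute(match):
        nonlocal missing
        name = match.group(1)
        if name not in variables:
            missing = True
            return ""
        return str(variables[name])

    result = MACRO.sub(substitute, template)
    return None if missing else result


def build_argv(arguments, variables):
    """Simplified model (assumption): honour set_if, skip unset values."""
    argv = []
    for key in sorted(arguments):
        spec = arguments[key]
        if "set_if" in spec:
            flag = resolve(spec["set_if"], variables)
            if flag in (None, "", "0", "False", "false"):
                continue  # flag not enabled in the service
        if "value" not in spec:
            argv.append(key)  # boolean switch such as --chkraids
            continue
        value = resolve(spec["value"], variables)
        if value is not None:
            argv += [key, value]
    return argv


arguments = {
    "--user": {"value": "$fujitsu_eternus_dx_user$"},
    "--warning": {"value": "$fujitsu_eternus_dx_warning$"},
    "--critical": {"value": "$fujitsu_eternus_dx_critical$"},
    "--chkvolumes": {"set_if": "$fujitsu_eternus_dx_chkvolumes$"},
    "--chkraids": {"set_if": "$fujitsu_eternus_dx_chkraids$"},
}
variables = {
    "fujitsu_eternus_dx_chkraids": True,
    "fujitsu_eternus_dx_warning": "N1",
    "fujitsu_eternus_dx_critical": "M1",
}
argv = build_argv(arguments, variables)
print(argv)
```

In this model only the parameters that the service actually sets show up in the argument list; --user and --chkvolumes are left out because their variables are not defined.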

The rest is more or less up to you. Write the CheckCommand just one time in a good generic way, and always have readable parameter definitions. You can also hide these parameters in service templates if you want.

Cheers,
Michael

Hello Michael,

WOW, thanks a lot for this extensive explanation which really made a number of aspects much clearer for me.

But even after implementing your code, the check fails:

## Plugin-Output
error : 202 : Host or IP is not specified.
['--chkraids', '--host', '10.74.194.25', '--user', 'icinga1', '--verbose', '1']

As you can see, all necessary parameters are there, but I guess Icinga executes the command like this:

/usr/bin/python2.7 /usr/lib/nagios/plugins/check_fujitsu_eternus_dx.pyc --chkraids --host 10.74.194.25 --user icinga1 --verbose 1

But the plugin needs the equal sign between parameter and value:

/usr/bin/python2.7 /usr/lib/nagios/plugins/check_fujitsu_eternus_dx.pyc --chkraids --host=10.74.194.25 --user=icinga1 --verbose=1

Regards,

Stefan

Hi,

Hm, I am not sure about this. Python plugins typically parse their options with the argparse module, which accepts both --key value and --key=value. Since this plugin is precompiled and closed source, it is hard to tell what it actually does.

I would try a different thing - it could also be that the parameter list is order dependent, e.g. that --host must be set as the first parameter. Such things should be tested on the CLI first though.
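For comparison, this is how a plugin built on Python's standard argparse module would behave. The option names mirror the ones used in this thread, but the parser itself is purely hypothetical, since the real plugin's source is not available.

```python
import argparse

# Hypothetical parser; the real plugin's option handling is unknown.
parser = argparse.ArgumentParser(prog="check_example")
parser.add_argument("--host")
parser.add_argument("--user")
parser.add_argument("--verbose", type=int)
parser.add_argument("--chkraids", action="store_true")

# Space-separated form, as Icinga passes key and value by default.
spaced = parser.parse_args(
    ["--chkraids", "--host", "10.74.194.25", "--user", "icinga1", "--verbose", "1"]
)

# Equal-sign form, as used in the Nagios command definition.
equals = parser.parse_args(
    ["--chkraids", "--host=10.74.194.25", "--user=icinga1", "--verbose=1"]
)

print(spaced == equals)
```

Both forms parse to the same result with argparse, so a plugin that rejects the space-separated form is probably doing its own option parsing.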

If it really is the assignment operator for key-value pairs, well, then you have multiple options:

  • Fix the plugin (likely not since the sources are not available, or maybe only to customers)
  • Wrap the arguments passed to the plugin and replace key-value pairs with = (Icinga doesn’t have that capability).
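The wrapper option could look roughly like the sketch below. It only covers the argument rewriting; the paths are the ones from this thread, the function names are made up, and it assumes that option values never start with "--".

```python
# Assumptions: interpreter and plugin paths as used in this thread.
PYTHON_BIN = "/usr/bin/python2.7"
PLUGIN = "/usr/lib/nagios/plugins/check_fujitsu_eternus_dx.pyc"


def join_key_value(args):
    """Turn ['--host', 'x'] into ['--host=x']; leave bare flags alone."""
    joined = []
    i = 0
    while i < len(args):
        current = args[i]
        has_value = (
            current.startswith("--")
            and "=" not in current
            and i + 1 < len(args)
            and not args[i + 1].startswith("--")
        )
        if has_value:
            joined.append(current + "=" + args[i + 1])
            i += 2
        else:
            joined.append(current)
            i += 1
    return joined


def build_command(args):
    """Return the full command line (as a list) for the real plugin."""
    return [PYTHON_BIN, PLUGIN] + join_key_value(args)


icinga_args = ["--chkraids", "--host", "10.74.194.25",
               "--user", "icinga1", "--verbose", "1"]
print(build_command(icinga_args))
```

A complete wrapper script would pass the result of build_command(sys.argv[1:]) to subprocess.call() and exit with the plugin's return code, so that Icinga still sees the original check state.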

Anyways, I’d consider looking into alternative plugins or writing your own as well. Likely you can reverse engineer a lot from capturing the traffic with tcpdump when the plugin is run.

Cheers,
Michael

PS: Closed source sucks, another example.

Hi,

I will write a wrapper for this plugin and post it on exchange.icinga.com.

Thanks again for your help,

Stefan

Hi,

@smguenther: Have you written the wrapper? I have exactly the same problem with this plugin.

Best regards,
Rafael

Hi again,

found a solution without a wrapper:

object CheckCommand "fujitsu_eternus" {
    import "plugin-check-command"
    command = [ PluginContribDir + "/check_fujitsu_eternus_dx.pyc" ]
    timeout = 2m
    arguments += {
        "--check" = {
            skip_key = true
            value = "--$eternus_check$"
        }
        "--critical" = {
            skip_key = true
            value = "--critical=$crit$"
        }
        "--performance" = {
            skip_key = true
            value = "--performance=1"
        }
        "--user" = {
            skip_key = true
            value = "--user=$user$"
        }
        "--warning" = {
            required = true
            skip_key = true
            value = "--warning=$warn$"
        }
        "-H" = {
            required = true
            skip_key = true
            value = "--host=$host.address$"
        }
    }
}
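Why this works can be illustrated with a small Python model of skip_key. It is a simplification under assumptions, not Icinga's real code: with skip_key only the value is emitted, so "--user=icinga1" reaches the plugin as a single argument. The function name render and the alphabetical ordering of keys are assumptions of this sketch.

```python
def render(arguments, variables):
    """Simplified model (assumption): with skip_key only the value is emitted."""
    argv = []
    for key in sorted(arguments):
        spec = arguments[key]
        value = spec["value"]
        # Naive macro replacement, sufficient for this illustration.
        for name, replacement in variables.items():
            value = value.replace("$" + name + "$", replacement)
        if spec.get("skip_key"):
            argv.append(value)
        else:
            argv += [key, value]
    return argv


variables = {"user": "icinga1", "host.address": "10.74.194.25"}

# Default behaviour: key and value become two separate arguments.
plain = render({"--user": {"value": "$user$"}}, variables)

# With skip_key the value carries the complete "--key=value" token.
with_skip_key = render(
    {
        "--user": {"skip_key": True, "value": "--user=$user$"},
        "-H": {"skip_key": True, "value": "--host=$host.address$"},
    },
    variables,
)

print(plain)
print(with_skip_key)
```

The argument key then only serves as a label inside the CheckCommand object, which is why "-H" can stand in for --host in the definition above.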

But now I get this error:

DISK-BusyRate UNKNOWN - error : 290 : Internal error occurred.[ssh return code = 255]

Pseudo-terminal will not be allocated because stdin is not a terminal. Host key verification failed.

No idea how to work around this. I think this would be a problem in Nagios as well.

Edit: Found it.

Sometimes a problem is so easy to solve that no one is talking about it :wink:
Just store the remote host's key in known_hosts, or log in to the server once from the console as the icinga user.