Linux Server Service Checks with Python

Hi all,

I am trying to find out how to add some service checks for my MGMT-VM, which is an Ubuntu server. As I have found out, some templates that exist for Windows are not available for Linux, for example the Windows traffic check. From what I have heard and read so far, I would have to write a check from scratch on my own.

Some say Python would be the best way to go. But I do not know how or where to start, or even how I would fetch the needed data and have it displayed in Icinga itself.

Does anyone have a tip or a quickstart guide for this?

With kind regards,
MusicBoy

Have a look at the @linuxfabrik monitoring-plugins. Maybe you can find the right checks there, or even sets of checks with templates, or at least some inspiration on how to write your own. There are also guidelines that should help improve the quality of your self-written plugins.

Icinga 2 executes check commands on the monitored server. These check commands do the actual check and return the check state, some text, and potentially performance metrics.

To be able to execute the check commands, Icinga 2 requires a CheckCommand object which defines which command should be executed and its parameters. However, lots of predefined CheckCommands are already shipped with an Icinga 2 installation as part of the Icinga Template Library.

This is further described in the Monitoring Basics, Commands documentation section. Further details are available in the Service Monitoring section. Please take your time reading these sections, which also contain examples in Python. In parallel, there are lots of good examples out there, for example @rivad’s suggestion or those on https://exchange.icinga.com/.


The process I follow, @MusicBoy, is:

  • Search for built-in checks
  • Search for checks somebody else has built (Google), including linuxfab or Nagios Exchange, as well as tools like thola or nwc.

If none of these suit my needs I work out what command I (as a human) need to run to get the results I want. From here I can determine how complicated a check I’ll need.

The check itself needs to return output, a status code and performance data (optional).
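
As a rough illustration of those three parts, here is a minimal Python sketch. The function names (`evaluate`, `format_output`, `main`), the load-average check and the thresholds 4.0 and 8.0 are all made up for this example. The exit codes 0 to 3 and the `label=value;warn;crit` performance data layout follow the usual plugin conventions.

```python
import os

# Standard plugin exit codes understood by Icinga and Nagios.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a measured value to a state, assuming higher values are worse."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def format_output(state, message, perfdata=None):
    """Build the status line: text first, optional performance data after '|'."""
    line = f"{STATE_NAMES[state]} - {message}"
    if perfdata:
        # perfdata maps a label to a (value, warn, crit) tuple.
        metrics = " ".join(
            f"{label}={value};{warn};{crit}"
            for label, (value, warn, crit) in perfdata.items()
        )
        line += f" | {metrics}"
    return line


def main():
    """Hypothetical check: 1-minute load average against fixed thresholds."""
    warn, crit = 4.0, 8.0  # assumed thresholds, adjust to taste
    try:
        load1 = os.getloadavg()[0]
    except OSError:
        print(format_output(UNKNOWN, "could not read the load average"))
        return UNKNOWN
    state = evaluate(load1, warn, crit)
    perfdata = {"load1": (round(load1, 2), warn, crit)}
    print(format_output(state, f"load1 is {load1:.2f}", perfdata))
    return state
```

A real plugin script would end with `sys.exit(main())` (after `import sys`) so that the state becomes the process exit code, which is what Icinga reads.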

For simple checks I will often just use bash, for which I have a template. When I need lots of arguments and logic I’ll use Python.

While linuxfab produces very nice checks, my own Python checks are based around a library, sol1-monitoring-plugins-lib (on PyPI), which helps me manage the output Icinga needs, particularly when I’m checking multiple parts and returning the overall result along with the part that failed.

eg:

Warning - Alarm [2] is warning
OK: Alarm 1
Warning: Alarm 2 - reason it is warning for humans
OK: Alarm 3
OK: Alarm 4

It is the output of the plugin that creates the monitoring data for Icinga.
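
To show the idea of combining several parts into one overall result, here is a small standalone sketch. It is not the API of sol1-monitoring-plugins-lib; `summarise` and the severity order (Critical over Warning over Unknown over OK) are assumptions made for this example.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "Warning", CRITICAL: "Critical", UNKNOWN: "Unknown"}
# Assumed severity order for picking the overall state; other orders are in use too.
SEVERITY = {OK: 0, UNKNOWN: 1, WARNING: 2, CRITICAL: 3}


def summarise(results):
    """Combine per-part results into one state and a multi-line output.

    results is a list of (name, state, detail) tuples, one per checked part.
    Returns (overall_state, output_text).
    """
    if not results:
        return UNKNOWN, "Unknown - nothing was checked"

    # The overall state is the most severe state of any part.
    overall = max((state for _, state, _ in results), key=SEVERITY.get)

    failed = [name for name, state, _ in results if state != OK]
    if failed:
        summary = f"{STATE_NAMES[overall]} - {', '.join(failed)} not OK"
    else:
        summary = f"OK - all {len(results)} parts are OK"

    # First line is the summary, then one line per part for the humans.
    lines = [summary]
    for name, state, detail in results:
        line = f"{STATE_NAMES[state]}: {name}"
        if detail:
            line += f" - {detail}"
        lines.append(line)
    return overall, "\n".join(lines)
```

Calling `summarise` with four alarms, one of them in warning, gives a summary line naming the failed alarm followed by one line per alarm, similar in spirit to the output above.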

For reference, here is a bash template I use which helps me return the right monitoring data. It is not as powerful as the Python library I built, but it is meant for simple checks, so it doesn’t need to be.

#!/bin/bash
# Template for a simple check plugin: prints a status line and exits with the matching code.

function trimMessage() {
        MESSAGE="$(echo -e "${MESSAGE}" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' )"
}

STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

exitCode=$STATE_UNKNOWN
MESSAGE=""

# make sure args exist
if [ -n "$1" ] ; then
        arg1="$1"
        # Add test logic here, eg:
        # result=`echo 1`
        # if [ "$arg1" == "$result" ]; then
        #        MESSAGE="Everything is OK"
        #        exitCode=$STATE_OK
        # fi
else
        exitCode=$STATE_UNKNOWN
        MESSAGE="Missing parameters ${MESSAGE} ($*)"
fi

trimMessage

if [ $exitCode -eq $STATE_UNKNOWN ]; then
        echo "UNKNOWN - $MESSAGE"
elif [ $exitCode -eq $STATE_CRITICAL ]; then
        echo "CRITICAL - $MESSAGE"
elif [ $exitCode -eq $STATE_WARNING ]; then
        echo "WARNING - $MESSAGE"
elif [ $exitCode -eq $STATE_OK ]; then
        echo "OK - $MESSAGE"
fi

exit $exitCode

Thanks for promoting your Python library. I was not aware of its existence and I am going to check it out!

In the past, I have used the testinfra Python library, which allows writing unit tests with helper functions for services, system information, and the like. Its output format can mimic the Nagios plugin format, making it perfectly usable for Icinga.

Thank you for sharing!

BTW, it would be nice to include in standard templates:

  • Processing for standard arguments: -H and --hostname, -w and --warning, -c and --critical, -v and --verbose, -V and --version, -h and --help, --debug, etc.
  • Processing for the standard threshold format (see the Nagios Plugins Development Guidelines)
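
For anyone unfamiliar with the threshold format mentioned in the second point, a range is written as `[@]start:end`, and a check alerts when the value is outside the range (or inside it, with a leading `@`). The sketch below is one possible way to parse it in Python; `parse_range` and `alerts` are hypothetical names, and treating an empty start as 0 is my reading of the guidelines.

```python
import math


def parse_range(spec):
    """Parse a threshold range '[@]start:end' into (start, end, inside)."""
    inside = spec.startswith("@")
    if inside:
        spec = spec[1:]
    if not spec:
        raise ValueError("empty threshold range")

    if ":" in spec:
        start_text, end_text = spec.split(":", 1)
    else:
        # A bare number N means the range 0..N.
        start_text, end_text = "0", spec

    if start_text == "~":
        start = -math.inf  # '~' stands for negative infinity
    elif start_text == "":
        start = 0.0        # assumption: a missing start defaults to 0
    else:
        start = float(start_text)

    # A missing end means positive infinity.
    end = math.inf if end_text == "" else float(end_text)

    if start > end:
        raise ValueError(f"range start {start} is greater than end {end}")
    return start, end, inside


def alerts(value, spec):
    """Return True if value should raise an alert for the given range."""
    start, end, inside = parse_range(spec)
    in_range = start <= value <= end
    # Without '@' alert outside the range, with '@' alert inside it.
    return in_range if inside else not in_range
```

For example, `alerts(11, "10")` is true (outside 0..10), `alerts(5, "10:")` is true (below 10), and `alerts(15, "@10:20")` is true (inside 10..20).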

I’m using argparse for arguments, which already provides --help, and --debug is handled together with logging. --version is an excellent suggestion for inclusion.
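
A sketch of what such an argparse setup could look like is below. It is only an illustration, not the code from the library: `PluginArgumentParser`, `build_parser`, `parse_args` and the version string are invented here. One detail worth knowing is that argparse exits with status 2 on a usage error, which a monitoring system would read as CRITICAL, so this sketch overrides `error` to exit with 3 (UNKNOWN) instead.

```python
import argparse
import logging
import sys

__version__ = "0.1.0"  # hypothetical version string


class PluginArgumentParser(argparse.ArgumentParser):
    """ArgumentParser that reports usage errors as UNKNOWN (exit code 3)."""

    def error(self, message):
        self.print_usage(sys.stderr)
        self.exit(3, f"UNKNOWN - {message}\n")


def build_parser():
    parser = PluginArgumentParser(description="Example check plugin (illustrative only)")
    # -h/--help is added by argparse automatically.
    parser.add_argument("-H", "--hostname", help="host to check")
    parser.add_argument("-w", "--warning", help="warning threshold")
    parser.add_argument("-c", "--critical", help="critical threshold")
    parser.add_argument("-v", "--verbose", action="count", default=0,
                        help="increase output verbosity (repeatable)")
    parser.add_argument("-V", "--version", action="version",
                        version=f"%(prog)s {__version__}")
    parser.add_argument("--debug", action="store_true", help="enable debug logging")
    return parser


def parse_args(argv=None):
    """Parse arguments and switch logging to DEBUG when --debug is given."""
    args = build_parser().parse_args(argv)
    logging.basicConfig(level=logging.DEBUG if args.debug else logging.WARNING)
    return args
```

Which of the threshold and host switches a given check really needs still depends on the check, so they can simply be left out of `build_parser` where they do not apply.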

In my experience the remaining switches may or may not be needed on a check-by-check basis, so having them automatically wouldn’t be beneficial.

We have been playing around with extending the MonitoringPlugin class to include common tasks and parsing of the standard threshold format. However, we’ve found that a standard set of logical tests (e.g. "is the value in this range") has so many variations in real use that we end up writing custom ones with minor differences anyway.

For @MusicBoy or anybody else looking to write their own plugins, I’d suggest avoiding complexity if possible. This is why my first choice for a Linux plugin is bash. I only dip into Python or other languages and libraries (aka dependencies) if the check requires that complexity or the benefit is worth its cost.

Our Python libraries are also available on PyPI to make writing plugins and handling everyday tasks such as Nagios ranges, human-readable numbers and table output (just to mention a few) as straightforward as possible.
