I’ve got some custom checks that I need to write, involving SNMPv3 data retrievals from custom hardware devices.
One of the core requirements is that I need to compare current results to previous results.
What I’m wondering is if there are any established “best practices” around this - or if there are some known caveats that I should probably be aware of.
Most of the check code already exists. What I’m doing now is integrating it into our Icinga2 environment.
(Note: these checks are written in Python. The time required to retrieve the necessary data so greatly exceeds any difference in execution time between Python and C that it’s not worth the extra effort to convert my code.)
So far, the only information I’ve found includes:
https://www.monitoring-plugins.org/doc/index.html, which leads me to
https://www.monitoring-plugins.org/doc/faq/private-c-api.html#state-information - all of which appear to really only apply to the monitoring-plugins package.
I’m not finding much anywhere else - are there other resources I’m missing? (I made a fair effort to find answers about this here, without success. Searching for “state retention” only brings up some Graphite-related questions, and I couldn’t think of any other search terms that put me on the right track.)
It seems to me (from what I’m not seeing) that there really aren’t any commonly endorsed methods. I should be able to pick a particular directory, allocate space for it, make it configurable, and create a directory structure within it like
<test_name>/<host>/data-<x> and go on my merry way.
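To make that layout concrete, here is a rough sketch of how I picture building those paths. The base directory, the `state_path` name, and the rule for replacing unsafe characters are all placeholders of mine, not anything Icinga2 or the monitoring-plugins project prescribes.

```python
import re
from pathlib import Path

# Hypothetical base directory; any location writable by the check user would do.
STATE_BASE = Path("/var/lib/custom-checks")

# Anything outside this set is replaced, so a host name cannot add path separators.
_UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def _clean(value):
    """Reduce a value to a single safe path component."""
    cleaned = _UNSAFE.sub("_", str(value)).lstrip(".")
    return cleaned or "_"


def state_path(test_name, host, slot, base=STATE_BASE):
    """Return <base>/<test_name>/<host>/data-<slot>."""
    return Path(base, _clean(test_name), _clean(host), f"data-{_clean(slot)}")
```

For example, `state_path("check_foo", "router/1", 0)` would give `/var/lib/custom-checks/check_foo/router_1/data-0`.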
- I know I need to lock the files to prevent concurrent execution, no problem. (I also know I should be able to time out on the lock request after 10 seconds. If I can’t get to the file in that period of time, there’s something else really wrong.)
- I know I need to ensure the tests fail gracefully. If for any reason the file can’t be opened, or the current data in the file “doesn’t make sense”, or anything else along those lines, I need to return a status of Unknown.
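The two points above could look roughly like this. It is only a sketch under my own assumptions: the state is stored as a JSON object, the names `StateError`, `lock_with_timeout`, `swap_state`, and `run_check` are made up for illustration, and `fcntl.flock` limits it to Unix-like systems. Exit code 3 is the usual plugin return code for UNKNOWN.

```python
import fcntl
import json
import os
import time

OK = 0
UNKNOWN = 3  # conventional plugin exit code for UNKNOWN


class StateError(Exception):
    """The state file could not be locked, read, or understood."""


def lock_with_timeout(fileobj, timeout=10.0, poll=0.1):
    """Take an exclusive lock, giving up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            fcntl.flock(fileobj, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return
        except BlockingIOError:
            if time.monotonic() >= deadline:
                raise StateError(f"could not lock state file within {timeout:g}s")
            time.sleep(poll)


def swap_state(path, current, timeout=10.0):
    """Return the previous state (None on first run) and store `current`."""
    try:
        os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
        # "a+" creates the file if needed without truncating it before the lock is held.
        with open(path, "a+", encoding="utf-8") as fh:
            lock_with_timeout(fh, timeout)
            fh.seek(0)
            raw = fh.read()
            previous = json.loads(raw) if raw.strip() else None
            if previous is not None and not isinstance(previous, dict):
                raise StateError("state file does not contain a JSON object")
            fh.seek(0)
            fh.truncate()
            json.dump(current, fh)
            fh.flush()
            return previous  # the lock is released when the file is closed
    except (OSError, ValueError) as exc:
        # Covers permission problems, missing mounts, and malformed JSON.
        raise StateError(str(exc)) from exc


def run_check(path, current):
    """Print a plugin status line and return the exit code."""
    try:
        previous = swap_state(path, current)
    except StateError as exc:
        print(f"UNKNOWN - {exc}")
        return UNKNOWN
    if previous is None:
        print("OK - no previous data yet, baseline stored")
    else:
        print(f"OK - previous={previous} current={current}")
    return OK
```

The real comparison logic would replace the last `print`. One caveat with this version: rewriting the file in place is not crash-safe, so a check killed mid-write leaves a file that the next run reports as UNKNOWN rather than silently misreading.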
All advice is welcome, thanks.