Incorrect values in all checks after updating to 2.13.1

Robby · September 21, 2021, 1:12pm

After updating to 2.13.1, the values of all disk, memory, and load checks dropped by about 10%. It doesn’t matter whether it’s a Linux or Windows system. It seems to be a problem with how the perfdata is interpreted. A disk with 80 GB shows up in Grafana as only 72 GB, and all thresholds are reduced in the same way.

Before updating, we ran version 2.12.5. Has something changed that could cause this? I couldn’t find an answer in the changelog.

Kind regards
Robert

dgoetz · September 21, 2021, 2:26pm

2.13 introduced support for new units of measurement: Service Monitoring - Icinga 2

So disk and memory values have likely changed wherever something was using the incorrect conversion (GB vs. GiB), but the changed load values must have another cause.
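To put rough numbers on the GB vs. GiB mismatch, here is a minimal Python sketch. It is not Icinga code; the function name is made up for illustration. It only computes how much smaller a value appears when a count in binary units is read as if it were in decimal units.

```python
# Shortfall when a number counted in binary units (KiB, MiB, ...) is
# interpreted as if it were in decimal units (kB, MB, ...).
# Hypothetical helper for illustration; not part of Icinga or any plugin.
UNIT_PAIRS = [("kB", "KiB"), ("MB", "MiB"), ("GB", "GiB"), ("TB", "TiB")]

def shortfall_percent(power: int) -> float:
    """Percent lost when 1024**power bytes are read as 1000**power bytes."""
    return (1 - 1000**power / 1024**power) * 100

for power, (decimal, binary) in enumerate(UNIT_PAIRS, start=1):
    print(f"{binary} read as {decimal}: -{shortfall_percent(power):.1f}%")
```

The gap grows with the prefix: roughly 5% at the mega level, 7% at giga, and 9% at tera, so the size of the observed drop hints at which unit is being misread.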

Matlib · September 25, 2021, 4:36pm

Monitoring plugins report sizes as powers of 2. The default unit for check_disk is megabytes, so if Icinga started dividing that by 1048576 a second time, it will simply show wrong values.

# /usr/local/libexec/nagios/check_disk -w 80% -c 90% -u bytes /
DISK CRITICAL - free space: / 4845006848 B (25% inode=93%);| /=14286000128B;-2147483648;2079456870;0;20794568704
# /usr/local/libexec/nagios/check_disk -w 80% -c 90% /
DISK CRITICAL - free space: / 4620 MB (25% inode=93%);| /=13624MB;3966;1983;0;19831
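
The two runs above can be cross-checked with a short Python sketch. It assumes only the perfdata strings shown in this post; the parsing helper is hypothetical and handles just the simple `label=value[unit];...` shape. The "MB" figure matches the byte figure when 1 MB is taken as 2**20 bytes, and comes out several percent low when it is taken as 10**6 bytes.

```python
import re

# Perfdata copied from the two check_disk runs above.
PERF_BYTES = "/=14286000128B;-2147483648;2079456870;0;20794568704"
PERF_MB = "/=13624MB;3966;1983;0;19831"

def parse_value(perfdata):
    """Return (value, unit) of one 'label=value[unit];...' perfdata item."""
    match = re.match(r"[^=]+=(-?\d+)([A-Za-z%]*)", perfdata)
    if match is None:
        raise ValueError(f"unparsable perfdata: {perfdata!r}")
    return int(match.group(1)), match.group(2)

true_bytes, _ = parse_value(PERF_BYTES)
mb_value, _ = parse_value(PERF_MB)

as_binary = mb_value * 1024**2   # plugin's meaning: 1 "MB" = 2**20 bytes
as_decimal = mb_value * 1000**2  # SI meaning: 1 MB = 10**6 bytes

print(f"reported in bytes : {true_bytes}")
print(f"MB as 2**20 bytes : {as_binary}")
print(f"MB as 10**6 bytes : {as_decimal} "
      f"({as_decimal / true_bytes:.1%} of the byte figure)")
```

The binary interpretation differs from the byte figure by less than one MiB, which is just the rounding in the MB output.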

dgoetz · September 27, 2021, 6:59am

Yes, but in this case the monitoring plugin would be wrong, as powers of 2 are not MB but MiB. The Nagios plugins fixed this a while ago (2.3.0) and later changed the default (2.3.2). For the monitoring plugins, an issue about this mismatch has been open since 2017.

Matlib · October 2, 2021, 2:04pm

The monitoring plugins spec comes from the mid-1990s, when the “MiB” nonsense hadn’t been invented yet. A lot of accompanying software follows this.

The obvious solution is to always use --unit bytes (or equivalent for different plugins) to work around the confusion. I’d suggest adding a bold notice in the documentation.

leeclemens · October 23, 2021, 12:04am

Seems like there’s a bug with the Nagios plugin’s perfdata. Using --units bytes:

Raspbian GNU/Linux 10 (buster)

check_disk v2.2

DISK OK - free space: / 45327069184 B (75% inode=91%); /boot 213937664 B (80% inode=-);
| /=2147483647B;2147483647;2147483647;0;2147483647 /boot=50351616B;224645888;237860352;0;264289280

check_disk v2.3.3.3.g82e5

DISK OK - free space: / 45327085568 B (75.35% inode=91%); /boot 213937664 B (80.94% inode=-);
| /=1937420288B;;;0;-1693577216 /boot=50351616B;224645888;237860352;0;264289280

Both are wrong, but in different ways.
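
One possible reading, offered as a guess rather than a confirmed diagnosis: both outputs look like a signed 32-bit integer limit being hit. 2147483647 is 2**31 - 1, which is what clamping would produce, and the v2.3.3 figures are close to what wrapping the true byte counts modulo 2**32 would give. The Python sketch below checks that arithmetic using only numbers posted in this thread; the helper name is made up, and it says nothing about where in the plugin this might happen.

```python
INT32_MAX = 2**31 - 1
MIB = 1024**2

def wrap_int32(n: int) -> int:
    """Reduce n to the range of a signed 32-bit integer (two's complement)."""
    return (n + 2**31) % 2**32 - 2**31

# Used and total size of / from the --units MiB output in this thread.
used_mib, total_mib = 14135, 59824
# Used and total size of / from the check_disk v2.3.3 byte perfdata.
seen_used, seen_total = 1937420288, -1693577216

for name, mib, seen in (("used", used_mib, seen_used),
                        ("total", total_mib, seen_total)):
    wrapped = wrap_int32(mib * MIB)
    # The MiB figures are rounded, so agreement is only expected within 1 MiB.
    print(f"{name}: wrapped={wrapped} reported={seen} diff={seen - wrapped}")

print("v2.2 value equals INT32_MAX:", 2147483647 == INT32_MAX)
```

Both differences come out below one MiB, which fits the rounding in the MiB output. That is consistent with wraparound in the newer version and clamping in the older one, but it does not prove either.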

For comparison, with --units MiB:

DISK OK - free space: / 43227 MiB (75.35% inode=91%); /boot 204 MiB (80.94% inode=-);
| /=14135MiB;50850;53841;0;59824 /boot=48MiB;214;226;0;252

leeclemens · October 23, 2021, 12:25am

CentOS 7 seems fine:

CentOS Linux 7 (Core)
check_disk v2.3.3

DISK OK - free space: / 29723013120 B (36.91% inode=98%);
| /=50796068864B;68441219686;72467173785;0;80519081984

leeclemens · October 23, 2021, 6:12pm

You may want to file a bug report with those folks. Or simply use a more suitable unit for your disks.

Al2Klimov · March 12, 2024, 11:37am

FWIW, in Icinga 2.14 we’ve dropped the MB-default:

https://github.com/Icinga/icinga2/pull/9642

Now it’s the plugin’s responsibility to choose a sane unit.