Check_http random socket timeout

I monitor some websites with the check_http plugin. This works quite well so far. However, I get a critical alarm at irregular intervals: CRITICAL - Socket timeout after 10 seconds.

I have looked at the access logs on the web server and cannot find any entries for the time when the alarm was triggered.

I therefore assume that the problem is on the monitoring server itself and that the request is never sent to the website.

How could I debug this?

Icinga v2.12.5-1 is running on an Ubuntu 20.04 server. The configuration was done with Icinga Director.

The service configuration looks like:

template Service "_default" {
    max_check_attempts = "3"
    check_interval = 1m
    retry_interval = 30s
    check_timeout = 30s
    enable_notifications = true
    enable_active_checks = true
    enable_passive_checks = true
    enable_event_handler = false
    enable_flapping = false
    enable_perfdata = true
}

template Service "http" {
    import "_default"

    check_command = "http"
    vars.http_critical_time = "5"
    vars.http_ignore_body = true
    vars.http_onredirect = "follow"
    vars.http_ssl = true
    vars.http_vhost = "$host.address$"
    vars.http_warn_time = "2"
}

Thank you in advance! :slight_smile:

You can enable the debug log to get more details (icinga2 feature enable debuglog, followed by a restart of Icinga 2). You can also run the plugin manually with --verbose added.

It would be useful to know a bit more about your Icinga network.

Do you have:

a) just a single Icinga server performing all monitoring checks?

b) a Master with one or more Agents?

c) a Master with one or more Satellites, and one or more Agents?

Finally, out of the above descriptions, which Icinga server is performing the
http service check on your web server?

Thanks,

Antony.

I have one Icinga master, which performs the monitoring checks for systems where it’s not possible to install the Icinga Agent, such as websites, printers, switches, etc. The Icinga Agent is running on the servers (16).

The http service check is running on the Icinga master.

Best regards
Stefan

Thank you! I will try the debug log functionality. :slight_smile:

Best regards
Stefan

Here is an excerpt from the debug.log:

[2021-08-04 14:41:28 +0200] notice/Process: PID 584506 ('/usr/lib/nagios/plugins/check_http' '--no-body' '-H' 'www.mydomain.tld' '-I' 'www.mydomain.tld' '-S' '-c' '5' '-f' 'follow' '-w' '2') terminated with exit code 2
[2021-08-04 14:41:28 +0200] notice/Dependency: Dependency 'mydomain.tld!internet-connection' passed: Parent host '_internet-connection' matches state filter.
[2021-08-04 14:41:28 +0200] notice/Dependency: Dependency 'mydomain.tld!http!host' passed: Parent host 'mydomain.tld' matches state filter.
[2021-08-04 14:41:28 +0200] notice/Dependency: Dependency 'mydomain.tld!internet-connection' passed: Parent host '_internet-connection' matches state filter.
[2021-08-04 14:41:28 +0200] notice/Dependency: Dependency 'mydomain.tld!http!host' passed: Parent host 'mydomain.tld' matches state filter.
[2021-08-04 14:41:28 +0200] debug/Checkable: Update checkable 'mydomain.tld!http' with check interval '60' from last check time at 2021-08-04 14:41:28 +0200 (1.62808e+09) to next check time at 2021-08-04 14:41:56 +0200 (1.62808e+09).
[2021-08-04 14:41:28 +0200] notice/ApiListener: Relaying 'event::SetNextCheck' message
[2021-08-04 14:41:28 +0200] notice/Checkable: State Change: Checkable 'mydomain.tld!http' soft state change from OK to CRITICAL detected.
[2021-08-04 14:41:28 +0200] notice/ApiListener: Relaying 'event::CheckResult' message

Unfortunately, that doesn’t help me either.

What is the result of this command?

/usr/lib/nagios/plugins/check_http --no-body -H www.mydomain.tld -I www.mydomain.tld -S -c 5 -f follow -w 2 --verbose

I’m not sure whether -H and -I work together.
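
Since the timeout only shows up at irregular intervals, a single manual run may well come back OK. One way to catch a failing run is to repeat the command and keep the verbose output only when it exits non-zero. The following is a minimal sketch of that idea, not something Icinga ships: the helper name log_on_failure, the log path /tmp/check_http_failures.log and the 30-second sleep are all assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of Icinga or the monitoring plugins):
# run a command once and, if it exits non-zero, append a timestamped
# record with the exit code and the full output to a log file.
log_on_failure() {
    logfile=$1
    shift
    status=0
    # Capture stdout and stderr; remember the exit code of the command.
    output=$("$@" 2>&1) || status=$?
    if [ "$status" -ne 0 ]; then
        {
            printf '%s exit=%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$status"
            printf '%s\n' "$output"
        } >> "$logfile"
    fi
    return "$status"
}

# Example loop (uncomment to use). The log path and the interval are
# assumptions; the command line is the one taken from the debug log.
# while true; do
#     log_on_failure /tmp/check_http_failures.log \
#         /usr/lib/nagios/plugins/check_http --no-body -H www.mydomain.tld \
#         -I www.mydomain.tld -S -c 5 -f follow -w 2 --verbose
#     sleep 30
# done
```

Running the loop as the same user as the Icinga daemon, on the master, should keep the conditions close to the real check. The timestamps in the log file can then be compared with the CRITICAL entries in Icinga and with the web server's access log.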