Issue: GraphiteWriter not writing Performance Data to Carbon-Cache

Hello everyone,

Icinga2 does not send Performance Data to Carbon-Cache.
Is there a known issue related to RHEL 7.4?

tcpdump -n port 2003
does not show any data packets. Icinga only establishes the connection.
Sending performance data to carbon-cache manually works as expected, and the data shows up in the Graphite web UI.

To verify that it is not an issue with the custom checks, I added localhost with a ping check.
The issue is still the same.

The "process performance data" toggle is activated, and Icinga is connected to Carbon.

The last check result is not shown, only the executed command.

Best Regards
ArMa

Your Environment

  • Version used (icinga2 --version):
    icinga2 - The Icinga 2 network monitoring daemon (version: r2.10.2-1)

Copyright © 2012-2018 Icinga Development Team
License GPLv2+: GNU GPL version 2 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

System information:
Platform: Red Hat Enterprise Linux Server
Platform version: 7.4 (Maipo)
Kernel: Linux
Kernel version: 3.10.0-693.17.1.el7.x86_64
Architecture: x86_64

Build information:
Compiler: GNU 4.8.5
Build host: unknown

Application information:

General paths:
Config directory: /etc/icinga2
Data directory: /var/lib/icinga2
Log directory: /var/log/icinga2
Cache directory: /var/cache/icinga2
Spool directory: /var/spool/icinga2
Run directory: /run/icinga2

Old paths (deprecated):
Installation root: /usr
Sysconf directory: /etc
Run directory (base): /run
Local state directory: /var

Internal paths:
Package data directory: /usr/share/icinga2
State path: /var/lib/icinga2/icinga2.state
Modified attributes path: /var/lib/icinga2/modified-attributes.conf
Objects path: /var/cache/icinga2/icinga2.debug
Vars path: /var/cache/icinga2/icinga2.vars
PID path: /run/icinga2/icinga2.pid

  • Enabled features (icinga2 feature list):
    Disabled features: compatlog elasticsearch gelf influxdb opentsdb perfdata statusdata syslog
    Enabled features: api checker command debuglog graphite ido-mysql livestatus mainlog notification

  • Icinga Web 2 version and modules (System - About):

  • Config validation (icinga2 daemon -C):
    [2019-02-22 10:36:24 +0100] information/cli: Icinga application loader (version: r2.10.2-1)
    [2019-02-22 10:36:24 +0100] information/cli: Loading configuration file(s).
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Committing config item(s).
    [2019-02-22 10:36:24 +0100] information/ApiListener: My API identity: xxx-xxxx.xxx-xx.com
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 2299 Services.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 1 LivestatusListener.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 1 IcingaApplication.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 737 Hosts.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 2 FileLoggers.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 2 NotificationCommands.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 1 NotificationComponent.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 6 HostGroups.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 1 ApiListener.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 1 GraphiteWriter.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 1 CheckerComponent.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 3 Zones.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 1 ExternalCommandListener.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 2 Endpoints.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 3 ApiUsers.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 1 User.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 1 IdoMysqlConnection.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 296 CheckCommands.
    [2019-02-22 10:36:24 +0100] information/ConfigItem: Instantiated 1 ServiceGroup.
    [2019-02-22 10:36:24 +0100] information/ScriptGlobal: Dumping variables to file ‘/var/cache/icinga2/icinga2.vars’
    [2019-02-22 10:36:24 +0100] information/cli: Finished validating the configuration file(s).

  • If you run multiple Icinga 2 instances, the zones.conf file (or icinga2 object list --type Endpoint and icinga2 object list --type Zone) from all affected nodes.
    Object ‘xxxx-xxxx1.xxx-xx’ of type ‘Endpoint’:
    % declared in ‘/etc/icinga2/zones.conf’, lines 1:0-1:42

    • __name = “xxxx-xxxx1.xxx-xx”

    • host = “xx”
      % = modified in ‘/etc/icinga2/zones.conf’, lines 2:3-2:23

    • log_duration = 86400

    • name = “xxxx-xxxx1.xxx-xx”

    • package = “_etc”

    • port = “5665”

    • source_location

      • first_column = 0
      • first_line = 1
      • last_column = 42
      • last_line = 1
      • path = “/etc/icinga2/zones.conf”
    • templates = [ “xxxx-xxxx1.xxx-xx” ]
      % = modified in ‘/etc/icinga2/zones.conf’, lines 1:0-1:42

    • type = “Endpoint”

    • zone = “”

Object ‘xxxx-xxxx2.xxx-xx’ of type ‘Endpoint’:
% declared in ‘/etc/icinga2/zones.conf’, lines 5:1-5:43

  • __name = “xxxx-xxxx2.xxx-xx”

  • host = “xx”
    % = modified in ‘/etc/icinga2/zones.conf’, lines 6:3-6:23

  • log_duration = 86400

  • name = “xxxx-xxxx2.xxx-xx”

  • package = “_etc”

  • port = “5665”

  • source_location

    • first_column = 1
    • first_line = 5
    • last_column = 43
    • last_line = 5
    • path = “/etc/icinga2/zones.conf”
  • templates = [ “xxxx-xxxx2.xxx-xx” ]
    % = modified in ‘/etc/icinga2/zones.conf’, lines 5:1-5:43

  • type = “Endpoint”

  • zone = “”

Object ‘master’ of type ‘Zone’:
% declared in ‘/etc/icinga2/zones.conf’, lines 9:1-9:20

  • __name = “master”

  • endpoints = [ “xxxx-xxxx1.xxx-xx”, “xxxx-xxxx2.xxx-xx” ]
    % = modified in ‘/etc/icinga2/zones.conf’, lines 10:3-10:74

  • global = false

  • name = “master”

  • package = “_etc”

  • parent = “”

  • source_location

    • first_column = 1
    • first_line = 9
    • last_column = 20
    • last_line = 9
    • path = “/etc/icinga2/zones.conf”
  • templates = [ “master” ]
    % = modified in ‘/etc/icinga2/zones.conf’, lines 9:1-9:20

  • type = “Zone”

  • zone = “”

Object ‘global-templates’ of type ‘Zone’:
% declared in ‘/etc/icinga2/zones.conf’, lines 14:1-14:30

  • __name = “global-templates”

  • endpoints = null

  • global = true
    % = modified in ‘/etc/icinga2/zones.conf’, lines 15:3-15:15

  • name = “global-templates”

  • package = “_etc”

  • parent = “”

  • source_location

    • first_column = 1
    • first_line = 14
    • last_column = 30
    • last_line = 14
    • path = “/etc/icinga2/zones.conf”
  • templates = [ “global-templates” ]
    % = modified in ‘/etc/icinga2/zones.conf’, lines 14:1-14:30

  • type = “Zone”

  • zone = “”

Object ‘director-global’ of type ‘Zone’:
% declared in ‘/etc/icinga2/zones.conf’, lines 18:1-18:29

  • __name = “director-global”

  • endpoints = null

  • global = true
    % = modified in ‘/etc/icinga2/zones.conf’, lines 19:3-19:15

  • name = “director-global”

  • package = “_etc”

  • parent = “”

  • source_location

    • first_column = 1
    • first_line = 18
    • last_column = 29
    • last_line = 18
    • path = “/etc/icinga2/zones.conf”
  • templates = [ “director-global” ]
    % = modified in ‘/etc/icinga2/zones.conf’, lines 18:1-18:29

  • type = “Zone”

  • zone = “”

Hi,

what happens if you manually force a re-check through the web interface?

Cheers,
Michael

Hi,

I forced a recheck via the "Check now" button. After ~10 seconds the check is executed again and the timer is reset.

BR
ArMar

Hi,

the issue is only on this system and the other cluster node.
Two other icinga instances are running like they should with a similar configuration.
I have compared parts of the configuration files with those other icinga instances.

BR
ArMa

Maybe a screenshot can explain everything a bit better:

The Icinga 2 log contains the following entry, which repeats every couple of minutes:
[2019-02-22 15:51:58 +0100] information/WorkQueue: #7 (GraphiteWriter, graphite) items: 0, rate: 9.58333/s (575/min 3028/5min 9122/15min);

Do both master endpoints have the Graphite feature enabled (icinga2 feature list)?
Also, please show the output of tcpdump on both master nodes.

Cheers,
Michael

Hello Michael,

echo "local.random.diceroll 3 $(date +%s)" | nc Master1-ip 2003
If I send this command from one of the nodes, the value is displayed in the Graphite web UI.
So Carbon is working as it should.
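
For reference, the manual test above uses Carbon's plaintext protocol: one line per datapoint in the form "metric.path value unix-timestamp", terminated by a newline. A minimal sketch of that, assuming bash and nc are available; the helper name carbon_line is hypothetical and Master1-ip is the same placeholder as in the command above:

```shell
#!/usr/bin/env bash
# Build one datapoint line for Carbon's plaintext protocol:
#   <metric.path> <value> <unix-timestamp>
# carbon_line is a hypothetical helper name, not part of Graphite or Icinga.
carbon_line() {
  local path="$1" value="$2" ts="${3:-$(date +%s)}"
  printf '%s %s %s\n' "$path" "$value" "$ts"
}

# Usage (not executed here): send a test datapoint to the plaintext listener.
# Replace Master1-ip with the address of the carbon-cache host.
#   carbon_line local.random.diceroll 3 | nc Master1-ip 2003
```

With a ten-digit timestamp this line is 35 bytes long including the newline, which matches the "length 35" payload packets in the tcpdump output below.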

The storage schema for carbon:

[carbon]
pattern = ^carbon.
retentions = 60:90d

[icinga]
pattern = .*
retentions = 60s:1d
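
Each retentions entry has the form precision:history. As a rough check of what these two rules store, the number of datapoints per archive is the history divided by the precision. A minimal sketch, assuming bash; to_seconds and retention_points are hypothetical helpers that only handle the s, m, h, d, w and y suffixes and read a bare precision as seconds:

```shell
#!/usr/bin/env bash
# Convert a duration such as 60s, 1d or a bare number of seconds to seconds.
to_seconds() {
  local v="$1" n unit
  n="${v%%[a-z]*}"      # leading digits
  unit="${v#"$n"}"      # trailing unit letter, may be empty
  case "$unit" in
    ""|s) echo "$n" ;;
    m) echo $(( n * 60 )) ;;
    h) echo $(( n * 3600 )) ;;
    d) echo $(( n * 86400 )) ;;
    w) echo $(( n * 604800 )) ;;
    y) echo $(( n * 31536000 )) ;;
    *) echo "unknown unit: $unit" >&2; return 1 ;;
  esac
}

# Number of datapoints one "precision:history" archive holds.
retention_points() {
  local precision="${1%%:*}" history="${1#*:}"
  case "$history" in
    # History given as a duration: divide by the precision.
    *[a-z]) echo $(( $(to_seconds "$history") / $(to_seconds "$precision") )) ;;
    # History without a unit is already a point count.
    *) echo "$history" ;;
  esac
}

retention_points 60s:1d   # prints 1440
retention_points 60:90d   # prints 129600
```

So the [icinga] rule keeps one value per minute for a single day, while the [carbon] rule keeps one value per minute for 90 days.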

I had to change the tcpdump command to: tcpdump -i lo -n port 2003
Dump recorded while sending the echo command:

14:01:54.273660 IP 127.0.0.1.39430 > 127.0.0.1.cfinger: Flags [S], seq 2761136951, win 43690, options [mss 65495,sackOK,TS val 1131523806 ecr 0,nop,wscale 7], length 0
03:00:47.685205 IP 127.0.0.1.cfinger > 127.0.0.1.39430: Flags [S.], seq 3904425610, ack 2761136952, win 43690, options [mss 65495,sackOK,TS val 1131523806 ecr 1131523806,nop,wscale 7], length 0
14:01:54.273689 IP 127.0.0.1.39430 > 127.0.0.1.cfinger: Flags [.], ack 1, win 342, options [nop,nop,TS val 1131523806 ecr 1131523806], length 0
14:01:54.273756 IP 127.0.0.1.39430 > 127.0.0.1.cfinger: Flags [P.], seq 1:36, ack 1, win 342, options [nop,nop,TS val 1131523806 ecr 1131523806], length 35
14:01:54.273762 IP 127.0.0.1.cfinger > 127.0.0.1.39430: Flags [.], ack 36, win 342, options [nop,nop,TS val 1131523806 ecr 1131523806], length 0
14:01:54.273776 IP 127.0.0.1.39430 > 127.0.0.1.cfinger: Flags [F.], seq 36, ack 1, win 342, options [nop,nop,TS val 1131523806 ecr 1131523806], length 0
14:01:54.274381 IP 127.0.0.1.cfinger > 127.0.0.1.39430: Flags [F.], seq 1, ack 37, win 342, options [nop,nop,TS val 1131523807 ecr 1131523806], length 0
14:01:54.274389 IP 127.0.0.1.39430 > 127.0.0.1.cfinger: Flags [.], ack 2, win 342, options [nop,nop,TS val 1131523807 ecr 1131523807], length 0
14:03:01.538124 IP 1xxxxxxx.48986 > xxxxxxx.cfinger: Flags [S], seq 3320827693, win 43690, options [mss 65495,sackOK,TS val 1131591070 ecr 0,nop,wscale 7], length 0
11:15:44.432669 IP xxxxxxx.cfinger > xxxxxxx.48986: Flags [S.], seq 693407851, ack 3320827694, win 43690, options [mss 65495,sackOK,TS val 1131591070 ecr 1131591070,nop,wscale 7], length 0
14:03:01.538166 IP xxxxxxx.48986 > xxxxxxx.cfinger: Flags [.], ack 1, win 342, options [nop,nop,TS val 1131591070 ecr 1131591070], length 0
14:03:01.538241 IP xxxxxxx.48986 > xxxxxxx.cfinger: Flags [P.], seq 1:36, ack 1, win 342, options [nop,nop,TS val 1131591070 ecr 1131591070], length 35
14:03:01.538252 IP xxxxxxx.cfinger > xxxxxxxx.48986: Flags [.], ack 36, win 342, options [nop,nop,TS val 1131591071 ecr 1131591070], length 0
14:03:01.538270 IP xxxxxxxx.48986 > xxxxxxxx.cfinger: Flags [F.], seq 36, ack 1, win 342, options [nop,nop,TS val 1131591071 ecr 1131591071], length 0
14:03:01.539010 IP xxxxxxxx.cfinger > xxxxxxxx.48986: Flags [F.], seq 1, ack 37, win 342, options [nop,nop,TS val 1131591071 ecr 1131591071], length 0
14:03:01.539031 IP xxxxxxxx.48986 > xxxxxxxxx.cfinger: Flags [.], ack 2, win 342, options [nop,nop,TS val 1131591071 ecr 1131591071], length 0

The second master cannot send traffic to the Carbon installation on the first master at the moment.

First master icinga2 feature list:

Disabled features: compatlog elasticsearch gelf influxdb opentsdb perfdata statusdata syslog
Enabled features: api checker command debuglog graphite ido-mysql livestatus mainlog notification

Second master icinga2 feature list:

Disabled features: compatlog elasticsearch gelf influxdb opentsdb perfdata statusdata syslog
Enabled features: api checker command debuglog graphite ido-mysql livestatus mainlog notification

Connection between carbon cache and icinga2:

lsof -i -n -P | grep 2003
carbon-ca 7960 carbon 11u IPv4 208388503 0t0 TCP *:2003 (LISTEN)
carbon-ca 7960 carbon 19u IPv4 267160895 0t0 TCP xxxxxxx1:2003->xxxxxx2:59148 (ESTABLISHED)
carbon-ca 7960 carbon 21u IPv4 228488782 0t0 TCP 127.0.0.1:2003->127.0.0.1:52738 (ESTABLISHED)
icinga2 17443 icinga 19u IPv4 228505451 0t0 TCP 127.0.0.1:52738->127.0.0.1:2003 (ESTABLISHED)

Hello everyone,

The solution was to enable performance data via System -> Monitoring Health. There is an additional, global setting there that activates performance data processing.
Even if performance data is enabled on the checks themselves, this global setting has to be activated as well.
Maybe this helps someone else solve the issue and saves a ton of time.
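
The global flag can also be read outside the web interface. A minimal sketch, assuming the Icinga 2 REST API is reachable on port 5665 and that its /v1/status/IcingaApplication endpoint reports the global enable_perfdata flag; the helper name perfdata_flag and the credentials root:icinga are placeholders, not values from this setup:

```shell
#!/usr/bin/env bash
# Extract the global enable_perfdata value (true/false) from the JSON that
# the status endpoint returns on stdin. perfdata_flag is a hypothetical
# helper; the grep-based parsing is deliberately simple and only looks at
# the first occurrence of the key.
perfdata_flag() {
  grep -Eo '"enable_perfdata"[[:space:]]*:[[:space:]]*(true|false)' \
    | head -n 1 \
    | grep -Eo '(true|false)$'
}

# Usage (not executed here; user and password are placeholders):
#   curl -k -s -u root:icinga \
#     'https://localhost:5665/v1/status/IcingaApplication' | perfdata_flag
```

A result of false would mean that performance data processing is switched off globally, which fits the behaviour described in this thread: the connection to Carbon is established, but no metrics are sent. A change made at runtime may also show up in the modified attributes file listed above (/var/lib/icinga2/modified-attributes.conf).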

@dnsmichi Thanks for trying to help me.

Best regards
ArMa

Hi,

ok, very weird. By default, this is enabled - someone must have disabled this at runtime. Glad you’ve figured it out by yourself.

Cheers,
Michael