Graphite not working after configuration of Icinga Reporting

Hello everyone,

I had a working Icinga 2 + Graphite setup that showed graphs both in Icinga and in the Graphite web interface. On Friday I installed the Icinga Reporting module following the installation instructions, including the IDO reports module.

This morning I noticed that there are no graphs anymore, neither in Icinga nor in Graphite-web. The value.wsp files in the carbon/whisper directory are still being updated, but in Graphite-web I only see a small icon that looks like a broken document in the upper left corner.

Does anyone have an idea what could have happened and how to fix it?

Any help is highly appreciated!

I checked the dependencies with:

/usr/local/src/graphite-web/check-dependencies.py

It says the following:
[REQUIRED] Unable to import the 'cairocffi' module, attempting to fall back to pycairo
[REQUIRED] Unable to import the 'cairo' module, do you have pycairo installed for python 2.7.5?
[REQUIRED] You have django version 1.6.11.6 installed, but version 1.9 or greater is required

I wonder how Graphite could have worked before, because I haven’t changed anything related to this since it last worked on Friday.

I tried to install the pycairo module, but it says:
Package pycairo-1.8.10-8.el7.x86_64 already installed and latest version
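
A quick way to cross-check what check-dependencies.py reports is to ask the interpreter directly which copy of each module it picks up. The following is only a minimal sketch and not part of Graphite: the function name `probe` is made up for illustration, and it assumes the script is run with the same Python interpreter that serves Graphite-web (here presumably the system Python 2.7). The printed file path shows whether a module comes from the distribution packages or from somewhere else such as /usr/local.

```python
from __future__ import print_function  # keeps the script usable on Python 2.7

import importlib


def probe(module_name):
    """Try to import module_name; return (ok, version, location_or_error)."""
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # ImportError, or OSError if a shared library fails to load
        return (False, None, "{0}: {1}".format(type(exc).__name__, exc))
    # Modules expose their version under different attribute names.
    version = None
    for attr in ("__version__", "VERSION", "version"):
        if hasattr(module, attr):
            version = getattr(module, attr)
            break
    location = getattr(module, "__file__", None) or "(no file, built-in or namespace)"
    return (True, version, location)


if __name__ == "__main__":
    # The three modules named in the check-dependencies.py output above.
    for name in ("cairocffi", "cairo", "django"):
        ok, version, detail = probe(name)
        print(name, "ok" if ok else "FAILED", version, detail)
```

If a module imports fine here but Graphite-web still fails, the web server is probably using a different interpreter or a different module search path than the shell.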

Hello,

Normally the Reporting module should not affect Graphite in any way.

Which OS version are you using?

Since you said the Whisper files are up to date, this is a web-only problem. Do you see anything in the Apache access or error log when you try to reach the site?

Greetings Epytir

Hello Epytir,

The error log says this:

[Mon Jul 29 09:40:50.681520 2019] [:error] [pid 42960] [client 10.151.65.14:52916] mod_wsgi (pid=42960): Exception occurred processing WSGI script '/usr/share/graphite/graphite-web.wsgi'., referer: http://192.168.253.122:8000/composer/?
[Mon Jul 29 09:40:50.681638 2019] [:error] [pid 42960] [client 10.151.65.14:52916] IOError: failed to write data, referer: http://192.168.253.122:8000/composer/?

The access log gets an entry like this every few seconds:

10.151.65.14 - - [29/Jul/2019:12:24:06 +0200] "GET /render?&target=icinga2..services.Import:_DokumentationDokumente.check-import.perfdata.Anzahl_Done-Dateien.value&source=0&width=300&height=120&hideAxes=false&lineWidth=2&hideLegend=true&colorList=049BAF&bgcolor=white&fgcolor=black HTTP/1.1" 500 811

OS Version:
CentOS Linux release 7.4.1708 (Core)

IcingaWeb Version:
2.4.2

Icinga Version:
r2.7.1-1

Thank you very much!
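
To see whether every /render request fails or only some of them, the access log can be summarized by status code. This is a rough sketch under a few assumptions: the log uses the common/combined layout shown in the entry above, the sample lines are made up for illustration, and the names `REQUEST_RE` and `render_status_counts` are hypothetical. On a real system the lines would come from the Apache access log file instead of the `SAMPLE` list.

```python
import re
from collections import Counter

# Request path and status code of an Apache common/combined log line.
REQUEST_RE = re.compile(r'"[A-Z]+ (?P<path>\S+) [^"]*" (?P<status>\d{3})\b')


def render_status_counts(lines):
    """Count HTTP status codes for requests whose path starts with /render."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match.group("path").startswith("/render"):
            counts[match.group("status")] += 1
    return counts


# Made-up sample lines in the same layout as the entry quoted above.
SAMPLE = [
    '10.0.0.1 - - [29/Jul/2019:12:24:06 +0200] "GET /render?target=a.b.value&width=300 HTTP/1.1" 500 811',
    '10.0.0.1 - - [29/Jul/2019:12:24:07 +0200] "GET /render?target=a.c.value&width=300 HTTP/1.1" 500 811',
    '10.0.0.1 - - [29/Jul/2019:12:24:08 +0200] "GET /composer/? HTTP/1.1" 200 1432',
]

if __name__ == "__main__":
    for status, count in sorted(render_status_counts(SAMPLE).items()):
        print(status, count)
```

If only status 500 shows up for /render while other pages return 200, the rendering step itself is failing, which fits a problem with the graphics dependencies rather than with carbon or the Whisper files.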

I’m noticing you have things in /usr/local/ and also just /usr, and you’re getting errors about the installed Django version. Did you install the old graphite that ships with CentOS7 and then attempt to install the new one from source?


OK, so IOErrors can have several causes.

  • How is the hardware usage on the machine (CPU, RAM, and disk I/O)? When the system is overloaded, Graphite will stop working.
    If you can, restart the machine. When the problem is caused by performance, a restart normally fixes it for some time so you can test.

If this is a virtual environment, take a snapshot of the machine before you try to fix the problem, so you can roll back if something goes wrong.

  • Have you tried installing Django from the repository?
    yum install python-django
    Maybe there is a newer version available for CentOS. (But I think CentOS uses Python 2.x and the newer versions are for 3.x.)

If you have pip available, you can install the dependencies via pip.
Or, if you want, you can install pip first:

yum install python-pip
pip install cairocffi
pip install django

I hope this helps. You said it was running last week. If nothing was changed, I suspect that the server has load issues.

Yes, please make sure not to install the same software in different ways (manually, from the repository, via pip …).
This can result in issues like the ones Blake mentioned.

That could be. The system has been running for 2-3 years and I don’t really remember how exactly I set it up. What would that mean? How could I get rid of one version safely?

The load on the virtual machine is okay. It’s monitoring itself and I didn’t get any alarms, so I don’t think that’s it.

So, having scrapped the default Graphite on CentOS 7 and manually installed the up-to-date one within the last year, I can say they don’t have the same dependencies. If you start installing things from pip, it will probably break everything even worse. EPEL ships the old versions of Django and cairo that the old version of Graphite depends on.

The script you have in /usr/local/src to verify dependencies is going to look for the dependencies for the newer version of graphite. If you haven’t already tried to install all that, there shouldn’t be a mess yet.

Does journalctl say anything about graphite web failing to start?

So, I just got my graphs back.

I compared the output of check-dependencies.py with that of another Icinga machine whose setup is identical. The only difference was this line:
[REQUIRED] Unable to import the 'cairo' module, do you have pycairo installed for python 2.7.5?

So I googled and installed cairo manually this way:
yum install cairo-devel

After that I restarted httpd manually and the graphs were back.

journalctl didn’t say anything about graphite_web. If I understand you correctly, what I did wasn’t what you recommended (I didn’t see your post before). Do you have any further hints for my configuration?

The systemd unit is just called graphite; ideally you get something that just says it’s listening. EDIT: no it isn’t, I made one for gunicorn a year ago and forgot.

Installing cairo libraries from yum, if you installed graphite from yum, is safe. In this case you weren’t even missing the python module for cairo, just the development libraries it was looking for. Probably a quirk with dependency management on EPEL’s part.

Then I really don’t understand why Graphite was working until Friday/today if cairo was missing altogether. Why wasn’t it a problem before?

You said you installed the Icinga Web 2 Reporting module. Which commands did you use for that? Maybe one of them changed something, but I can’t really say what happened without seeing the system.

I think if you monitor your server for some time and the problem doesn’t come back, then everything is fine :)

I did it exactly as documented. My first idea was that it could have been the database schema changes, but carbon had no problem getting the values, so that couldn’t be the problem, I guess. The Reporting installation was basically module activation and database changes.

cairo is a dependency of graphite, but not of carbon, so fortunately you didn’t lose any data. I can’t really see any correlation here. You can try running yum history to see if any undocumented changes have been made recently. If it did work on that machine, all I could think of is cairo-devel was uninstalled somehow. Since the reporting module isn’t shipped via an rpm, it shouldn’t have had the opportunity to.

35 | install cairo-devel      | 2019-07-29 13:15 | I, U           |   35
34 | install mesa-libOSMesa m | 2019-07-26 15:55 | I, U           |   15
33 | install google-chrome-st | 2019-07-26 15:54 | I, U           |   49 EE
32 | -y install epel-release  | 2019-07-26 15:52 | Update         |    1

These are the only entries from yum history.
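
In this listing the Action(s) column abbreviates Install as I and Update as U, so the transactions from Friday, 2019-07-26, did more than install new packages: they also updated packages that were already on the system. Below is a small sketch that picks such transactions out of the listing; the function names `parse_yum_history` and `updates_on` are made up for illustration, and the parser assumes the pipe-separated five-column layout shown above. Running `yum history info <ID>` for each reported ID then lists the individual packages that were touched.

```python
def parse_yum_history(lines):
    """Parse data rows of a `yum history` listing into dictionaries."""
    rows = []
    for line in lines:
        parts = [part.strip() for part in line.split("|")]
        # Data rows have five columns and start with a numeric transaction ID.
        if len(parts) != 5 or not parts[0].isdigit():
            continue
        altered = parts[4].split()  # e.g. ["49", "EE"]: count plus optional flags
        rows.append({
            "id": int(parts[0]),
            "command": parts[1],
            "date": parts[2],
            "actions": [action.strip() for action in parts[3].split(",")],
            "altered": int(altered[0]) if altered and altered[0].isdigit() else None,
        })
    return rows


def updates_on(rows, day):
    """Return IDs of transactions on `day` (YYYY-MM-DD) that updated packages."""
    return [
        row["id"]
        for row in rows
        if row["date"].startswith(day)
        and any(action in ("U", "Update") for action in row["actions"])
    ]


# The four rows quoted above.
HISTORY = [
    "35 | install cairo-devel      | 2019-07-29 13:15 | I, U           |   35",
    "34 | install mesa-libOSMesa m | 2019-07-26 15:55 | I, U           |   15",
    "33 | install google-chrome-st | 2019-07-26 15:54 | I, U           |   49 EE",
    "32 | -y install epel-release  | 2019-07-26 15:52 | Update         |    1",
]

if __name__ == "__main__":
    print(updates_on(parse_yum_history(HISTORY), "2019-07-26"))
```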

So, what are your experiences with updating Icinga/Icinga Web? Could it cause problems with my Graphite configuration?