In the last few days I had a problem with Icinga and its notifications, so I tried to restart the service (it was active) with systemctl restart icinga2.service. It did not restart and is now unavailable.
Afterwards, I noticed that a process (syslog-ng) was using 160% CPU, so I changed a config file and restarted the service. Now it uses 32% CPU and everything apparently looks fine.
So I thought: “Now Icinga2 shouldn’t have any problem restarting, let’s try to get it running.”
I used systemctl restart icinga2.service, but Icinga won’t come up!
systemctl status icinga2.service says:
icinga2.service: Main process exited, code=killed, status=6/ABRT
Failed to start Icinga host/service/network monitoring system.
icinga2.service: Unit entered failed state.
icinga2.service: Failed with result ‘signal’.
It also says that I can find more information under /var/log/icinga2/crash, but all the files are empty!
What can I do?
All the other services are running: Docker, Apache2, mysql…
with Icinga and its notifications, so I tried to restart the service (it was
active) with systemctl restart icinga2.service. It did not restart and is now
unavailable.
Anything useful in /var/log/icinga2/icinga2.log or
/var/log/icinga2/startup.log ?
Afterwards, I noticed that a process (syslog-ng) was using 160% CPU, so I
changed a config file
Which one?
What did you change?
and restarted the service. Now it uses 32% CPU and everything apparently looks
fine.
So, to be clear, syslog-ng is using 32% of your CPU?
That seems like a great deal to me.
So I thought: “Now Icinga2 shouldn’t have any problem restarting, let’s try
to get it running.”
I used systemctl restart icinga2.service, but Icinga won’t come up!
Anything useful in /var/log/icinga2/icinga2.log or
/var/log/icinga2/startup.log ?
systemctl status icinga2.service says:
icinga2.service: Main process exited, code=killed, status=6/ABRT
Failed to start Icinga host/service/network monitoring system.
icinga2.service: Unit entered failed state.
icinga2.service: Failed with result ‘signal’.
It also says that I can find more information under /var/log/icinga2/crash, but
all the files are empty!
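If those log files turn out to be missing or empty, the systemd journal and a config check are two other places worth looking. Here is a rough sketch; the function name icinga_diag is just something made up for illustration, and it assumes the icinga2 and journalctl binaries are on the PATH (each step is skipped if they are not).

```shell
# Hypothetical helper: collect basic startup diagnostics for Icinga 2.
icinga_diag() {
    # Validate the configuration without starting the daemon.
    if command -v icinga2 >/dev/null 2>&1; then
        icinga2 daemon -C
    fi
    # Show the last 50 journal entries for the unit; this may still have
    # messages even when /var/log/icinga2 has nothing useful.
    if command -v journalctl >/dev/null 2>&1; then
        journalctl -u icinga2.service -n 50 --no-pager
    fi
}
```

Run it as root with icinga_diag and post whatever it prints.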
The problem was that notifications randomly stopped being sent to my e-mail…
In /var/log, icinga2.log doesn’t exist, and neither does startup.log.
In the /var/log/icinga2 directory I found only old logs, and the “crash” directory has only empty files in it.
I found these files:
In /var/log/error there is:
icingasrv systemd[1]: failed to start Icinga host/service/network monitoring service
icingasrv syslog-ng[16589]: I/O error occurred while writing; fd=‘56’, error=‘Broken pipe (32)’
Sorry if I don’t report everything, but I’m in a sort of matryoshka of RDPs, VMware, etc., so I’m copying the logs by hand.
In /var/log/syslog there is:
icingasrv systemd[1]: icinga2.service: Failed with result ‘signal’.
And there is also an error that looks something like:
icingasrv icinga2[5616]: error: Function call ‘std::ifstream::open’ for file /NOW/IT/REFERS/TO/A/LONG/PATH failed with error code 2, ‘No such file or directory’
For syslog-ng, I changed the line #ForwardToSyslog=yes to ForwardToSyslog=no in /etc/systemd/journald.conf…
Now it uses 32% CPU, but that is better than 160%…
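To double-check that change, something like the sketch below prints the value that is actually set in the file. The function name forward_to_syslog is made up for this example, and it only reads the one file passed to it, so any drop-in files under a journald.conf.d directory are not taken into account.

```shell
# Hypothetical helper: print the ForwardToSyslog value from a
# journald.conf-style file, or "unset" if the line is missing or commented out.
forward_to_syslog() {
    conf="${1:-/etc/systemd/journald.conf}"
    # Only uncommented lines match; if there are several, the last one is used.
    value=$(sed -n 's/^[[:space:]]*ForwardToSyslog=\(.*\)$/\1/p' "$conf" 2>/dev/null | tail -n 1)
    echo "${value:-unset}"
}
```

Calling forward_to_syslog with no argument reads /etc/systemd/journald.conf. Note that journald only picks up an edit after systemctl restart systemd-journald.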
Holy moly, you should take immediate action here. When daemons cannot write their logs to /var/log, that’s really bad for their runtime; likewise, MySQL won’t be able to store things in the database under /var/lib/mysql.
Try
du -h --max-depth=1
to identify big directories. Most likely the logs are huge and need to be purged.
Dive deeper into the directories and figure out why log, for instance, is 5.2 GB in size. Try deleting files that aren’t required to free up space. Oh, and spool has 20 GB of data … did you by chance enable the perfdata feature without using PNP for metrics?
This affects every service on your host; all of them are suffering right now, and the CPU load is increasing because of it.
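To make the hunt a bit quicker, here is a small sketch that wraps the du call above and adds a df check. The function names disk_used_pct and largest_dirs are made up for illustration, and it assumes GNU coreutils (for du --max-depth and sort -h).

```shell
# Hypothetical helper: used space, in percent, of the filesystem holding a path.
disk_used_pct() {
    # df -P prints one header line, then one line per filesystem;
    # column 5 is the capacity, e.g. "97%".
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: biggest entries directly below a directory, largest first.
# du also reports the total for the directory itself, so expect that line too.
largest_dirs() {
    du -h --max-depth=1 "$1" 2>/dev/null | sort -rh | head -n "${2:-10}"
}
```

For example, disk_used_pct /var/log shows how full that filesystem is, and largest_dirs /var 5 lists the five biggest entries below /var. Repeat largest_dirs on whichever directory comes out on top until you reach the files that are eating the space.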
In terms of partitions I strongly recommend to create new ones for