MolbioUnige
(Molbio Unige)
September 11, 2019, 11:47am
1
Hi,
I’d like to be sure that 2 particular processes are running on a host. What I’m looking for are the following ones:
ps -ef | grep dsmc
root 2570 1 0 11:08 ? 00:00:00 /opt/tivoli/tsm/client/ba/bin/dsmcad
root 5324 2570 0 11:09 ? 00:00:00 /opt/tivoli/tsm/client/ba/bin/dsmc schedule /tmp/fileXpiT1y -optfile=/opt/tivoli/tsm/client/ba/bin/dsm.opt
I’ve made the following service, using Director:
template Service "check_dsmc" {
    import "check processes"
    check_period = "Work Days"
    notes = "Checks that dsmc (tsm) is running"
    vars.procs_argument_regex = "dsmc"
    vars.procs_critical = "2:2"
    vars.procs_user = "root"
}
In Icinga Web, I get the following critical error:
Plugin Output
PROCS CRITICAL: 1 process with regex args 'dsmc', UID = 0 (root)
When I inspect the object, it shows the following:
Executed Command
'/usr/lib/nagios/plugins/check_procs' '--ereg-argument-array' 'dsmc' '-c' '2:2' '-u' 'root' '-w' '250'
Executing that command "by hand" on the monitored server, I get the following result:
user@host:~$ '/usr/lib/nagios/plugins/check_procs' '--ereg-argument-array' 'dsmc' '-c' '2:2' '-u' 'root' '-w' '250'
PROCS OK: 2 processes with regex args 'dsmc', UID = 0 (root) | procs=2;250;2:2;0;
What am I doing wrong?
Enable the debug log and check which command and parameters are really used. Also run the check on the host as the same user that Icinga uses.
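For reference, the debug log is normally switched on with `icinga2 feature enable debuglog` followed by a restart of Icinga 2. As far as I know, that feature simply activates a logger object along the lines of the sketch below; the object name and path are the packaged defaults as I remember them, so treat both as assumptions and check your own features-available/debuglog.conf.

```
// Sketch of what the debuglog feature enables (assumed packaged defaults).
object FileLogger "debug-file" {
  // Log everything down to debug level, including executed check commands.
  severity = "debug"
  // LogDir usually resolves to /var/log/icinga2 on packaged installs.
  path = LogDir + "/debug.log"
}
```

The lines to look for are the notice/Process entries, which show the exact command line and its exit code.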
MolbioUnige
(Molbio Unige)
September 11, 2019, 12:29pm
3
I'm not sure what I should be looking for.
This is what I could extract from /var/log/icinga2/debug.log:
[2019-09-11 12:06:53 +0000] notice/Process: Running command '/usr/lib/nagios/plugins/check_procs' '--ereg-argument-array' 'dsmc' '-c' '2:2' '-u' 'root' '-w' '250': PID 2170
[2019-09-11 12:06:53 +0000] debug/CheckerComponent: Check finished for object 'sc.molbio.unige.ch!check_dsmc'
[2019-09-11 12:06:53 +0000] notice/Process: PID 2170 ('/usr/lib/nagios/plugins/check_procs' '--ereg-argument-array' 'dsmc' '-c' '2:2' '-u' 'root' '-w' '250') terminated with exit code 2
[2019-09-11 12:06:53 +0000] debug/Checkable: Update checkable 'sc.molbio.unige.ch!check_dsmc' with check interval '300' from last check time at 2019-09-11 12:06:53 +0000 (1.5682e+09) to next check time at 2019-09-11 12:11:44 +0000(1.5682e+09).
[2019-09-11 12:06:53 +0000] debug/DbEvents: add checkable check history for 'sc.molbio.unige.ch!check_dsmc'
[2019-09-11 12:06:54 +0000] debug/IdoMysqlConnection: Query: UPDATE icinga_servicestatus SET acknowledgement_type = ‘0’, active_checks_enabled = ‘1’, check_command = ‘procs’, check_source = ‘scrutinizer’, check_timeperiod_object_id = 257, check_type = ‘0’, current_check_attempt = ‘1’, current_notification_number = ‘0’, current_state = ‘2’, endpoint_object_id = 211, event_handler_enabled = ‘1’, execution_time = ‘0.016989’, flap_detection_enabled = ‘0’, has_been_checked = ‘1’, instance_id = 1, is_flapping = ‘0’, is_reachable = ‘1’, last_check = FROM_UNIXTIME(1568203613), last_hard_state = ‘2’, last_hard_state_change = FROM_UNIXTIME(1568203315), last_state_change = FROM_UNIXTIME(1568201596), last_time_critical = FROM_UNIXTIME(1568203613), latency = ‘0.735826’, long_output = ‘’, max_check_attempts = ‘3’, next_check = FROM_UNIXTIME(1568203904), next_notification = FROM_UNIXTIME(1568205043), normal_check_interval = ‘5’, notifications_enabled = ‘1’, original_attributes = ‘null’, output = 'PROCS CRITICAL: 1 process with regex args 'dsmc', UID = 0 (root) ', passive_checks_enabled = ‘1’, percent_state_change = ‘5.600000’, perfdata = ‘procs=1;250;2:2;0;’, problem_has_been_acknowledged = ‘0’, process_performance_data = ‘1’, retry_check_interval = ‘1’, scheduled_downtime_depth = ‘0’, service_object_id = 273, should_be_scheduled = ‘1’, state_type = ‘1’, status_update_time = FROM_UNIXTIME(1568203613) WHERE service_object_id = 273
In the last line, the check_source is 'scrutinizer', which is not the host I'm trying to monitor. Could that be the problem? When I run the command on scrutinizer, I do get the CRITICAL result.
Add command_endpoint = "hostwhereitshouldrunon" to the service,
or, if the service is applied via an apply rule, command_endpoint = host.name
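A minimal sketch of the apply-rule variant, assuming the "check_dsmc" template from the first post and a hypothetical custom variable host.vars.tsm_client to select the hosts. It also assumes each monitored host has an Endpoint object with the same name as the Host object, which is what makes host.name usable here. In Director, the "Run on agent" setting is meant to achieve the same thing.

```
// Sketch only: the service name and the assign condition are illustrative.
apply Service "dsmc-procs" {
  // Reuse the thresholds and procs_* vars from the template in the first post.
  import "check_dsmc"

  // Execute check_procs on the monitored host's agent instead of the master.
  command_endpoint = host.name

  // host.vars.tsm_client is a hypothetical custom variable; adapt to your hosts.
  assign where host.vars.tsm_client == true
}
```

Without command_endpoint, the plugin runs on the node that schedules the check, which is why the check source showed the Icinga server and only one matching process was found there.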
MolbioUnige
(Molbio Unige)
September 16, 2019, 6:52am
5
Thank you for your help. I hadn't set "Run on agent". I thought this was "automatic", since the check targets a different machine than the one running Icinga. I understand now why that cannot be the case.