Issues with restarting a service via an EventCommand

Hi everyone

I’m fairly new to Icinga and know my way around most settings by now, but I’ve run into a tricky issue.

I’m currently monitoring quite a few services via the PowerShell command “Invoke-IcingaCheckService”.

This works well as is. Below is my command definition:


object CheckCommand "Invoke-IcingaCheckService" {
    import "plugin-check-command"
    import "Powershell Base"

    arguments += {
        "-C" = {
            order = 0
            value = "Use-Icinga; exit Invoke-IcingaCheckService"
        }
        "-NoPerfData" = {
            description = "Disables the performance data output of this plugin"
            order = 99
        }
        "-Service" = {
            description = "Used to specify an array of services which should be checked against the status. Supports ‘*’ for wildcards."
            order = 2
            repeat_key = false
            value = {{
                var arr = macro("$IcingaCheckService_Array_service$")
                if (len(arr) == 0) {
                    return null
                }
                return arr.join(", ")
            }}
        }
        "-Status" = {
            order = 4
            value = "$IcingaCheckService_String_Status$"
        }
        "-Verbosity" = {
            order = 98
            value = "$Invoke-IcingaCheckService_Int32_Verbosity$"
        }
    }
}

My service template in the resolved state then looks like this:

zones.d/master/service_templates.conf

template Service "Windows Agent Check Service with Restart" {
    import "generic-service"

    check_command = "Invoke-IcingaCheckService"
    max_check_attempts = "1"
    enable_active_checks = true
    enable_passive_checks = true
    enable_event_handler = true
    event_command = "Powershell Restart-service V3"
    command_endpoint = host_name
}


Now the issue: I have defined a service restart script that is in a somewhat broken state. I have tried troubleshooting it both with and without AI and got nowhere.
Maybe someone in the community can point me in the right direction or help me debug this further.

Below are my event command and my PowerShell script:

zones.d/director-global/commands.conf


object EventCommand "Powershell Restart-service V3" {
    import "plugin-event-command"
    command = [
        "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe"
    ]
    timeout = 1m
    arguments += {
        "-Command" = {
            order = 3
            value = {{
                var services = macro("$IcingaCheckService_Array_service$")
                var state = macro("$service.state_id$")
                var state_type = macro("$service.state_type$")

                if (typeof(services) == Array) {
                    services = services.join(",")
                }

                var stateText = "UNKNOWN"

                if (state == 0) {
                    stateText = "OK"
                } else if (state == 2) {
                    stateText = "CRITICAL"
                } else if (state == 1) {
                    stateText = "WARNING"
                }

                if (state_type != 1) {
                    return "exit 0"
                }

                return "& { . 'C:\\\\Program Files\\\\ICINGA2\\\\sbin\\\\invoke-icingaRestartEventCommand.ps1'; invoke-icingaRestartEventCommand -Services '" + services + "' -ServiceState '" + stateText + "' }"
            }}
        }
        "-ExecutionPolicy" = {
            order = 2
            value = "Bypass"
        }
        "-NoProfile" = {
            order = 1
        }
    }
}

My restart script, loosely based on the one linked in its header comment:


<#
.SYNOPSIS
    Starts a Windows service, can be used as an event plugin
.DESCRIPTION
    Starts a Windows service, can be used as an event plugin
.PARAMETER Services
    Comma-separated list of services to be started
.PARAMETER ServiceState
    Icinga check state (OK, WARNING, CRITICAL, UNKNOWN)
.PARAMETER ServiceStateType
    Service state type (SOFT, HARD)
.PARAMETER ServiceAttempt
    Service attempts
.EXAMPLE
    Event Plugin
.LINK
    https://community.icinga.com/t/icinga-for-windows-how-to-restart-service-in-eventcommand/15273
#>

function invoke-icingaRestartEventCommand
{
    param(
        [string]$Services = "",
        [string]$ServiceState = "",
        [string]$ServiceStateType = "",
        [int]$ServiceAttempt = 0
    );

    $ServicesArray = ($Services -split ",").Trim()
    $status = ""

    if ($ServiceState -eq "CRITICAL") {
        foreach ($ServiceToRestart in $ServicesArray){
            try{
                $windowsService = Get-Service -Name $ServiceToRestart -ErrorAction SilentlyContinue

                if ($null -eq $windowsService) {
                    $status += "$($ServiceToRestart):not found "
                }
                elseif ($windowsService.Status -eq 'Running') {
                    Restart-Service -Name $ServiceToRestart -Force -ErrorAction Stop
                    $status += "$($ServiceToRestart):restarted "
                }
                else {
                    Start-Service -Name $ServiceToRestart -ErrorAction Stop
                    $status += "$($ServiceToRestart):started "
                }
            }
            catch{
                $status += "$($ServiceToRestart):not started "
            }
        }
    }
    else {
        $status = "not critical"
    }

    $IcingaCheck = New-IcingaCheck -Name 'StartService' -Value $status
    return (New-IcingaCheckResult -Check $IcingaCheck -Compile -NoPerfData $true);
}

Now to the main topic: in its current state, the event command does not fire at all.

Icinga itself detects that the service is critical but does nothing.

Here is a log snippet:


Matched Line: [2026-04-14 13:09:07 +0200] notice/Process: PID 6752 ('C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -C "Use-Icinga; exit Invoke-IcingaCheckService" -Service w3svc -Status running -Verbosity 2 -NoPerfData') terminated with exit code 0
Matched Line: [2026-04-14 13:09:54 +0200] notice/Process: Running command 'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -C "Use-Icinga; exit Invoke-IcingaCheckService" -Service w3svc -Status running -Verbosity 2 -NoPerfData': PID 10180
Matched Line: [2026-04-14 13:10:02 +0200] notice/Process: PID 10180 ('C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -C "Use-Icinga; exit Invoke-IcingaCheckService" -Service w3svc -Status running -Verbosity 2 -NoPerfData') terminated with exit code 2 #<--- Here I manually ended the service
Matched Line: [2026-04-14 13:10:06 +0200] notice/Process: Running command 'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -C "Use-Icinga; exit Invoke-IcingaCheckService" -Service w3svc -Status running -Verbosity 2 -NoPerfData': PID 11200
Matched Line: [2026-04-14 13:10:12 +0200] notice/Process: PID 11200 ('C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -C "Use-Icinga; exit Invoke-IcingaCheckService" -Service w3svc -Status running -Verbosity 2 -NoPerfData') terminated with exit code 2
Matched Line: [2026-04-14 13:11:00 +0200] notice/Process: Running command 'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -C "Use-Icinga; exit Invoke-IcingaCheckService" -Service w3svc -Status running -Verbosity 2 -NoPerfData': PID 1384
Matched Line: [2026-04-14 13:11:09 +0200] notice/Process: PID 1384 ('C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -C "Use-Icinga; exit Invoke-IcingaCheckService" -Service w3svc -Status running -Verbosity 2 -NoPerfData') terminated with exit code 2

One thing I noticed: if I do not declare ‘var stateText = "UNKNOWN"’, it at least tries to run, but at the moment I’m not sure which knob to turn to make it work.

I also made sure that the script itself runs locally on the server.

If anyone has some insights I would really appreciate it.

Kind regards
Jan Zehnder

[SOLVED] Self-healing works now

For anyone running into the same issue, here’s what was wrong and what fixed it:

Problem

The event command “Powershell Restart-service V3” never fired, even though the service check correctly reported CRITICAL.

Root causes

  1. max_check_attempts = 1 on the service template: this pushed the service straight to HARD CRITICAL on the first failed check. Icinga 2 runs event commands on check results in a SOFT state, on the transition to a HARD state, and on recovery, so with a single attempt there were no SOFT retries during which the handler could restart the service before the problem became HARD.
  2. State-type guard inverted inside the event command lambda:
    if (state_type != 1) { return "exit 0" }

    This told Icinga to skip the restart unless the service was already HARD, which is the opposite of what you want for self-healing.

Fix

Raised max_check_attempts to 3 on the service template.
Removed / inverted the state_type guard so the restart runs during SOFT CRITICAL attempts.
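
For illustration, here is one possible shape of the corrected guard. It is a minimal sketch, not my exact production config. It assumes the runtime macros $service.state$ and $service.state_type$ resolve to the strings listed in the Icinga 2 documentation (OK/WARNING/CRITICAL/UNKNOWN and SOFT/HARD); if that holds, a comparison of state_type against the number 1 would not match either value. The object name is hypothetical, and the script path with its escaping is copied unchanged from my original event command.

```icinga2
// Sketch only: hypothetical object name, path and escaping copied from the post above.
object EventCommand "Powershell Restart-service sketch" {
    import "plugin-event-command"
    command = [
        "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe"
    ]
    timeout = 1m
    arguments += {
        "-NoProfile" = {
            order = 1
        }
        "-ExecutionPolicy" = {
            order = 2
            value = "Bypass"
        }
        "-Command" = {
            order = 3
            value = {{
                var services = macro("$IcingaCheckService_Array_service$")
                // Assumed to resolve to "OK", "WARNING", "CRITICAL" or "UNKNOWN"
                var state = macro("$service.state$")
                // Assumed to resolve to "SOFT" or "HARD"
                var state_type = macro("$service.state_type$")

                if (typeof(services) == Array) {
                    services = services.join(",")
                }

                // Only act while the problem is still SOFT CRITICAL;
                // in every other case run a no-op so the handler exits cleanly.
                if (state != "CRITICAL" || state_type != "SOFT") {
                    return "exit 0"
                }

                return "& { . 'C:\\\\Program Files\\\\ICINGA2\\\\sbin\\\\invoke-icingaRestartEventCommand.ps1'; invoke-icingaRestartEventCommand -Services '" + services + "' -ServiceState '" + state + "' }"
            }}
        }
    }
}
```

Comparing against the string values avoids having to map numeric state IDs to text inside the lambda. Dropping the guard entirely also works, since the PowerShell script itself only acts when -ServiceState is CRITICAL.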

Result

On failure the service goes SOFT CRITICAL [1/3], the event handler restarts the Windows service, the next check returns OK, and no notification is triggered. If all three attempts fail, it escalates to HARD CRITICAL and notifies normally.
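
The state progression described above depends on the template allowing retries. A minimal sketch of the adjusted service template could look like this; the check_interval and retry_interval values are assumptions for illustration only, not values taken from my setup.

```icinga2
template Service "Windows Agent Check Service with Restart" {
    import "generic-service"

    check_command = "Invoke-IcingaCheckService"

    // Three attempts: two SOFT CRITICAL results before the state turns HARD.
    max_check_attempts = 3

    // Assumed values: how often to check normally and while in a SOFT state.
    check_interval = 1m
    retry_interval = 30s

    enable_active_checks = true
    enable_passive_checks = true
    enable_event_handler = true
    event_command = "Powershell Restart-service V3"
    command_endpoint = host_name
}
```

A short retry_interval keeps the window between a failed check and the follow-up check small, so a successful restart is confirmed quickly and the service never reaches HARD CRITICAL.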

Logs confirm the full cycle:

Matched Line: [2026-04-24 09:04:10 +0200] notice/Process: Running command 'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -C "Use-Icinga; exit Invoke-IcingaCheckService" -Service w3svc -Status running -Verbosity 2 -NoPerfData': PID 9872
Matched Line: [2026-04-24 09:04:22 +0200] notice/Process: PID 13644 ('C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -C "Use-Icinga; exit Invoke-IcingaCheckService" -Service w3svc -Status running -Verbosity 2 -NoPerfData') terminated with exit code 2
Matched Line: [2026-04-24 09:04:22 +0200] notice/Process: Running command 'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "& { . 'C:\Program Files\ICINGA2\sbin\invoke-icingaRestartEventCommand.ps1'; invoke-icingaRestartEventCommand -Services 'w3svc' -ServiceState 'CRITICAL' }"': PID 12088
Matched Line: [2026-04-24 09:04:26 +0200] notice/Process: PID 12088 ('C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "& { . 'C:\Program Files\ICINGA2\sbin\invoke-icingaRestartEventCommand.ps1'; invoke-icingaRestartEventCommand -Services 'w3svc' -ServiceState 'CRITICAL' }"') terminated with exit code 0
Matched Line: [2026-04-24 09:04:29 +0200] notice/Process: PID 9872 ('C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -C "Use-Icinga; exit Invoke-IcingaCheckService" -Service w3svc -Status running -Verbosity 2 -NoPerfData') terminated with exit code 0
Matched Line: [2026-04-24 09:04:50 +0200] notice/Process: Running command 'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -C "Use-Icinga; exit Invoke-IcingaCheckService" -Service w3svc -Status running -Verbosity 2 -NoPerfData': PID 852
Matched Line: [2026-04-24 09:05:12 +0200] notice/Process: PID 852 ('C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -C "Use-Icinga; exit Invoke-IcingaCheckService" -Service w3svc -Status running -Verbosity 2 -NoPerfData') terminated with exit code 0
Matched Line: [2026-04-24 09:05:12 +0200] notice/Process: Running command 'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "& { . 'C:\Program Files\ICINGA2\sbin\invoke-icingaRestartEventCommand.ps1'; invoke-icingaRestartEventCommand -Services 'w3svc' -ServiceState 'OK' }"': PID 8792
Matched Line: [2026-04-24 09:05:14 +0200] notice/Process: PID 8792 ('C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "& { . 'C:\Program Files\ICINGA2\sbin\invoke-icingaRestartEventCommand.ps1'; invoke-icingaRestartEventCommand -Services 'w3svc' -ServiceState 'OK' }"') terminated with exit code 0

Script update

I also updated my local restart script so it only starts the service when it’s actually down (instead of restarting an already-running service), and added simple size-based log rotation so I can audit what the event handler did:

<#
.SYNOPSIS
    Starts a Windows service, can be used as an event plugin
.DESCRIPTION
    Starts a Windows service, can be used as an event plugin
.PARAMETER Services
    Comma-separated list of services to be started
.PARAMETER ServiceState
    Icinga check state (OK, WARNING, CRITICAL, UNKNOWN)
.PARAMETER ServiceStateType
    Service state type (SOFT, HARD)
.PARAMETER ServiceAttempt
    Service attempts
.EXAMPLE
    Event Plugin
.LINK
    https://community.icinga.com/t/icinga-for-windows-how-to-restart-service-in-eventcommand/15273
#>

function invoke-icingaRestartEventCommand
{
    param(
        [string]$Services = "",
        [string]$ServiceState = "",
        [string]$ServiceStateType = "",
        [int]$ServiceAttempt = 0
    );
    $ServicesArray = ($Services -split ",").Trim()
    $status = ""
    if ($ServiceState -eq "CRITICAL") {
        foreach ($ServiceToRestart in $ServicesArray){
            try{
                $windowsService = Get-Service -Name $ServiceToRestart -ErrorAction SilentlyContinue
                if ($null -eq $windowsService) {
                    $status += "$($ServiceToRestart):not found "
                }
                elseif ($windowsService.Status -eq 'Running') {
                    $status += "$($ServiceToRestart):already running "
                }
                else {
                    Start-Service -Name $ServiceToRestart -ErrorAction Stop
                    $status += "$($ServiceToRestart):started "
                }
            }
            catch{
                $status += "$($ServiceToRestart):not started "
            }
        }

        # Only log when something actually happened (CRITICAL event)
        $logFile = "C:\ProgramData\icinga2\var\log\icinga2\restart-event.log"
        Add-Content -Path $logFile -Value "$(Get-Date -Format 'yyyy-MM-dd HH:mm:ss') [$ServiceState] $status"

        # Simple size-based rotation: if file > 5 MB, archive it
        if ((Test-Path $logFile) -and ((Get-Item $logFile).Length -gt 5MB)) {
            Move-Item $logFile "$logFile.old" -Force
        }
    }
    else {
        $status = "not critical"
    }
    $IcingaCheck = New-IcingaCheck -Name 'StartService' -Value $status
    return (New-IcingaCheckResult -Check $IcingaCheck -Compile -NoPerfData $true);
}