Problems with iLo-Plugin

Hi!

I’ve installed the plugin “check_ilo2_health” in my test environment and defined the following host and service templates, service check, CheckCommand, host definition and apply rule:

Apply-Rule:

apply Service "ilo" to Host {
        check_command = "check_ilo2_health"
        display_name = "iLo"

        vars.ilo_address = host.vars.ilo.ip
        vars.ilo_user = host.vars.ilo.username
        vars.ilo_pwd = host.vars.ilo.password

        assign where match("ilo*", host.name)
        assign where host.name == "ilo-ilo-192.168.120.15"

}

Check-Command:

# /opt/itmonplugins/check_ilo2_health.pl --ilo3 -H 192.168.120.15 -u maze-m -p maze-m-pw

object CheckCommand "check_ilo2_health" {
        import "plugin-check-command"

        command = [ itmonplugins + "/check_ilo2_health.pl" ]

        arguments = {
                "-H" = "$ilo_address$"
                "-u" = "$ilo_user$"
                "-p" = "$ilo_pwd$"
                "-t" = "$ilo_timeout$"
                "-3" = {
                        set_if = "$ilo_version_3$"
                        description = "Version 3/4. Details: https://exchange.icinga.org/exchange/check_ilo2_health"
                }
                "-i" = {
                        set_if = "$ilo_ignore_linkdown$"
                        description = "Ignore NIC Link Down status (iLO4)"
                }
                "-n" = {
                        set_if = "$ilo_ignore_temperature$"
                        description = "output without temperature listing."
                }
        }

        vars.ilo_version_3 = false
        vars.ilo_timeout = "60"
        vars.ilo_ignore_linkdown = false
        vars.ilo_ignore_temperature = false
}

Host- and Service-Template:

template Host "host-template-generice-host" {
  max_check_attempts = 3
  check_interval = 10s
  retry_interval = 10s

  check_command = "hostalive"
}

template Service "generic-service" {
  max_check_attempts = 5
  check_interval = 10s
  retry_interval = 10s
}

Host-Definition:

    object Host "ilo- 192.168.120.15" {
            import "host-template-generice-host"

            display_name      = "ILO"
            address       = "192.168.120.15"

            vars.ilo = {
                    ip = "192.168.120.15"
                    username = "maze-m"
                    password = "maze-m-pw"
            }
 }

When I run “service icinga2 checkconfig”, the syntax is fine. After “service icinga2 restart”, Icinga Web shows the service in the state “unknown” and the plugin output says:
" <Timeout exceeded.><Terminated by signal 9 (Killed).>"

Does anyone have an idea why I have this problem?

Thank you very much for your help.

The plugin takes too long to execute, and in order to avoid zombie processes, Icinga kills the execution.

Extract the command line for this service as explained in the troubleshooting docs, and test it as the Icinga user. Most likely it is either a firewall problem, or the iLO responds too slowly.
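One more thing that may be worth checking: as far as I know, Icinga 2 kills a check once the CheckCommand's `timeout` is reached (one minute by default), and the CheckCommand above passes `-t 60` to the plugin, so both limits run out at about the same moment. A minimal sketch, assuming you want the plugin's own timeout to fire first so that it can report its own error. The `90s` value is only an assumed example; `check_timeout` on the Service overrides the command's `timeout`:

```
apply Service "ilo" to Host {
        check_command = "check_ilo2_health"
        display_name = "iLo"

        vars.ilo_address = host.vars.ilo.ip
        vars.ilo_user = host.vars.ilo.username
        vars.ilo_pwd = host.vars.ilo.password

        # Plugin's own timeout (-t), in seconds; same value as the CheckCommand default
        vars.ilo_timeout = "60"

        # Assumed value: Icinga's kill limit, kept above the plugin's -t
        check_timeout = 90s

        assign where match("ilo*", host.name)
}
```

This does not make the iLO answer any faster. It only gives the plugin a chance to return its own message instead of being killed with signal 9.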

Cheers,
Michael

Thanks for your fast reply!

Can you give me the link to the troubleshooting docs so that I can understand the mistake?
And how can I test it as the Icinga user?

I had a look at the Icinga Monitoring Basics docs, and they say that an “object Service” is necessary, but I only have an “apply Service”… What’s the difference between these two?

Thanks a lot for your help!

Here you go: https://icinga.com/docs/icinga2/latest/doc/15-troubleshooting/

In your code for the check command definition you have the following line:

There you use the --ilo3 switch. Neither your Service apply rule nor your Host object sets ilo_version_3 to true, so the default of false from the check command applies and the "-3" argument is never added.

Have you tried running the script (as the user that executes icinga2) from the command line to see if it works there and to figure out which parameters you need?
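A minimal sketch of how the apply rule from the first post could switch the flag on. It assumes that every host matched by the rule has an iLO 3 or newer board, which the thread does not actually say:

```
apply Service "ilo" to Host {
        check_command = "check_ilo2_health"
        display_name = "iLo"

        vars.ilo_address = host.vars.ilo.ip
        vars.ilo_user = host.vars.ilo.username
        vars.ilo_pwd = host.vars.ilo.password

        # Overrides the CheckCommand default (false), so set_if adds "-3"
        vars.ilo_version_3 = true

        assign where match("ilo*", host.name)
}
```

With this in place, the command line that Icinga builds should contain the same version switch as the manual call in the comment above the CheckCommand.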

You could try what “-d: add PerfParse compatible temperature output.” does. I have never tried it. And to be honest every sensor reading in a new line would make a huge output, which I don’t like.

If that does not work, you will have to change the script code that creates the output, I would assume.

Why not use ipmi-sensor from the ITL?

Hi!

Thanks for all your replies!
We got it working by customizing the plugin itself, and now the output looks good.

But now we get the “error” that a board battery is missing on some of our customers’ iLOs.

These customers want this message to be deactivated. Is there a way to do this?

There is a “-x” switch that ignores missing battery warnings

"-x" = {
   set_if = "$ilo_ignorebatterymissing$"
   description = "Ignore Battery missing status"
}

Have you tried that?
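A minimal sketch of how this could be enabled per host once the "-x" block above has been added to the `arguments` of the CheckCommand. The key `ignore_battery_missing` is a hypothetical name that I made up for this example, and the sketch assumes that every matched host defines `vars.ilo`:

```
object Host "ilo- 192.168.120.15" {
        import "host-template-generice-host"

        address = "192.168.120.15"

        vars.ilo = {
                ip = "192.168.120.15"
                username = "maze-m"
                password = "maze-m-pw"
                # Hypothetical key, not part of the original host definition
                ignore_battery_missing = true
        }
}

apply Service "ilo" to Host {
        check_command = "check_ilo2_health"
        display_name = "iLo"

        vars.ilo_address = host.vars.ilo.ip
        vars.ilo_user = host.vars.ilo.username
        vars.ilo_pwd = host.vars.ilo.password

        # Only hosts that opt in get "-x"; other hosts keep the battery warning
        if (host.vars.ilo.ignore_battery_missing) {
                vars.ilo_ignorebatterymissing = true
        }

        assign where match("ilo*", host.name)
}
```

It is probably cleaner to also add `vars.ilo_ignorebatterymissing = false` as a default in the CheckCommand, like the other flags there, so that the macro always resolves.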

/George

Ah, thanks!

No, I hadn’t seen this :O.

That’ll help me a lot :)

Great! When you are done with this, choose a solution and mark the thread as solved.

Thanks

@logic:

Thanks for your answer! It helped me a lot. I tested the plugin again and again and now it works fine :)!

But the output in Icingaweb2 doesn’t look so nice :(

How can I format it so that every value is on its own line?

Thank you very much for your help!

Hi,

what exactly did you modify in the plugin already, as unified diff? Newlines should be returned as \n by the plugin.

Cheers,
Michael

I have the same issue, no newline. I’ve not modified the plugin.

That is the default behavior of the plugin.
If you want line breaks inside the output, you will have to modify the script (or hope that @maze-m shares his modified version :))