Icinga2 secondary API doesn't work anymore


Apparently the API on the secondary Icinga2 node is broken. I see these errors in the log. This is an Icinga2 2.5.4 cluster. The master is working fine, and icinga2 daemon -C doesn’t show any errors.

    critical/SSL: Error on bio X509 AUX reading pem file '/var/lib/icinga2/ca/ca.crt': 33558530, "error:02001002:system library:fopen:No such file or directory"

    [2019-11-19 12:17:13 -0800] warning/JsonRpcConnection: Error while processing message for identity 'sat1.test.com'  Error: std::exception (0) libbase.so: void  boost::throw_exception<icinga::openssl_error>(icinga::openssl_error const&) (+0x97) [0x7fb0d6d308a7]
        (1) libbase.so: void boost::exception_detail::throw_exception_<icinga::openssl_error>(icinga::openssl_error const&, char const*, char const*, int) (+0x40) [0x7fb0d6d30950]
        (2) libbase.so: icinga::GetX509Certificate(icinga::String const&) (+0x286) [0x7fb0d6ccb8a6]
        (3) libremote.so: <unknown function> (+0xcf443) [0x7fb0d634c443]
        (4) libremote.so: boost::detail::function::function_invoker2<icinga::Value (*)(boost::intrusive_ptr<icinga::MessageOrigin> const&, boost::intrusive_ptr<icinga::Dictionary> const&), icinga::Value, boost::intrusive_ptr<icinga::MessageOrigin> const&, boost::intrusive_ptr<icinga::Dictionary> const&>::invoke(boost::detail::function::function_buffer&, boost::intrusive_ptr<icinga::MessageOrigin> const&, boost::intrusive_ptr<icinga::Dictionary> const&) (+0xf) [0x7fb0d639041f]
        (5) libremote.so: icinga::ApiFunction::Invoke(boost::intrusive_ptr<icinga::MessageOrigin> const&, boost::intrusive_ptr<icinga::Dictionary> const&) (+0x1d) [0x7fb0d633cb7d]
        (6) libremote.so: icinga::JsonRpcConnection::MessageHandler(icinga::String const&) (+0x497) [0x7fb0d638a937]
        (7) libremote.so: icinga::JsonRpcConnection::MessageHandlerWrapper(icinga::String const&) (+0x4b) [0x7fb0d638c00b]
        (8) libbase.so: icinga::WorkQueue::WorkerThreadProc() (+0x492) [0x7fb0d6cdb542]
        (9) libboost_thread.so.1.53.0: <unknown function> (+0xc5c3) [0x7fb0d77535c3]
        (10) libpthread.so.0: <unknown function> (+0x7aa1) [0x7fb0d4b6caa1]
        (11) libc.so.6: clone (+0x6d) [0x7fb0d48b993d]

It seems quoting doesn’t work well here. I don’t see the directory /var/lib/icinga2/ca on the host.
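
To make that check repeatable on each node, here is a minimal Python sketch that reports whether the CA directory and the ca.crt named in the error exist. The function name `ca_file_status` is made up for illustration, and the default path is simply the one from the log above; it says nothing about whether the file should exist on a given node.

```python
from pathlib import Path


def ca_file_status(ca_dir="/var/lib/icinga2/ca"):
    """Report whether the CA directory and its ca.crt exist.

    The default path is taken from the error message in the log;
    ca_file_status itself is a hypothetical helper, not an Icinga tool.
    """
    base = Path(ca_dir)
    return {
        "dir": base.is_dir(),
        "ca.crt": (base / "ca.crt").is_file(),
    }


if __name__ == "__main__":
    for name, present in ca_file_status().items():
        print(f"{name}: {'present' if present else 'missing'}")
```

Running it on the master and on the secondary and comparing the output shows quickly which node actually holds the CA files.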

I’ve edited the topic to put the log in a Markdown code block (three backticks).

For the original question - 2.5.4 is long EOL and not supported anymore. How exactly is your cluster built? Please share the zones.conf from all involved endpoints.

Thanks, Michael. I will get the zones info soon. Does the OpenSSL error indicate that it failed to get the certificate? I’m not sure about that.

Does the secondary need to have the /var/lib/icinga2/ca directory and the certs in it? I thought that was only on the master.

I think I figured out the OpenSSL trouble. “sat1.test.com” was updated to 2.10.4-1, which might explain the “unknown-function” issue. I will downgrade it to 2.5.4, which should fix it.

However, I still don’t understand why the secondary is complaining about ‘/var/lib/icinga2/ca/ca.crt’.

Instead of downgrading, I’d recommend upgrading. You’re 6 versions behind, and doing that later will make it even harder. Most likely the secondary master received a signing request or similar, which came with the changes in 2.8. Since then, you cannot run one master older than 2.8 alongside one on 2.8 or newer.
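
As a rough illustration of that rule, the following Python sketch flags a set of nodes whose versions fall on both sides of 2.8. The helper names and the `master1` hostname are hypothetical, the version strings come from this thread, and applying the check to every node rather than only to masters is an assumption made for simplicity.

```python
import re


def parse_version(text):
    """Turn '2.10.4-1' into (2, 10, 4); the package release suffix is ignored."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognised version: {text!r}")
    return tuple(int(part or 0) for part in match.groups())


def straddles_2_8(versions):
    """True if some nodes are older than 2.8 while others are 2.8 or newer."""
    parsed = [parse_version(v) for v in versions.values()]
    has_old = any(v < (2, 8, 0) for v in parsed)
    has_new = any(v >= (2, 8, 0) for v in parsed)
    return has_old and has_new


# master1 is a made-up name; the versions are the ones mentioned in this thread.
nodes = {"master1": "2.5.4", "sat1.test.com": "2.10.4-1"}
print(straddles_2_8(nodes))  # True: 2.5.4 and 2.10.4-1 sit on opposite sides of 2.8
```

The versions are compared as integer tuples rather than strings so that 2.10 correctly sorts after 2.8.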

I did suspect that the updated satellites had been making those requests right before the OpenSSL error. I just wasn’t sure whether they were related, and I couldn’t find anything about it in the 2.10.x changelog. Looking at the 2.8.0 changelog, what you said makes sense.
We are building a new, separate cluster on the latest Icinga2 2.11.x instead of doing an in-place upgrade of the 2.5.4 cluster.

Thanks much!
