Replacing NRPE by Icinga agent (each agent requires own zone?)

holobolo0815 · October 28, 2019, 1:38pm

Hey,

Since NRPE is no longer recommended to run local checks on remote hosts, we were looking at Icinga running as agent. Now the following line from the distributed monitoring doc worries us a little bit:

Each agent requires its own zone and endpoint configuration.

We have lots of hosts and don’t want to define a zone for each of them. We simply want to run local checks on remote hosts in a secure way.

Any thoughts?

Thanks.

rsx · October 28, 2019, 1:49pm

This is by design: the zone and endpoint objects are essential. If you are using the Director, it will do this job for you.
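For many hosts, the per-agent objects can also be generated rather than written by hand. Below is a minimal sketch, assuming a parent zone named "master" and placeholder host names (neither comes from this thread). It only renders the Endpoint and Zone text for each agent and is not a replacement for the Director or the CLI wizards.

```python
# Sketch: render one Endpoint and one Zone object per agent.
# "master" and the example FQDNs are placeholders, not real names.

def render_agent_objects(fqdn: str, parent_zone: str = "master") -> str:
    """Return Icinga 2 config text for a single agent."""
    return (
        f'object Endpoint "{fqdn}" {{\n'
        f'  host = "{fqdn}"\n'
        f'}}\n'
        f'\n'
        f'object Zone "{fqdn}" {{\n'
        f'  endpoints = [ "{fqdn}" ]\n'
        f'  parent = "{parent_zone}"\n'
        f'}}\n'
    )


def render_all(fqdns, parent_zone: str = "master") -> str:
    """Join the config blocks for all agents, separated by blank lines."""
    return "\n".join(render_agent_objects(h, parent_zone) for h in fqdns)


if __name__ == "__main__":
    print(render_all(["web01.example.com", "db01.example.com"]))
```

The output could be written into a zones.conf style file by whatever configuration management is already in place.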

holobolo0815 · October 28, 2019, 7:28pm

So what’s the recommended way (best practice) to replace NRPE with something secure? check_by_ssh?

Thx

unic · October 29, 2019, 7:13am

Hi, we are using the Agents for Windows and check_by_ssh for most Linux systems.

dnsmichi · October 29, 2019, 11:07am

The Icinga agent is the preferred, recommended way, even with the overhead of zones. The zones add more security through a chain of trust, on top of TLS in the lower layers.

Start simple by adding one agent, and learn about the concepts and the setup tooling. There are CLI wizards available that abstract things away. Also, take a look at the Director, which abstracts things as well.

Cheers from DevOpsDays,
Michael

Napsty · November 1, 2019, 8:13am

I’m curious where you got this information?
NRPE uses SSL/TLS encryption if you use it correctly. I’m still using NRPE on all environments across thousands of hosts and if implemented correctly, there is no security risk.

holobolo0815 · November 1, 2019, 2:04pm

@Napsty To answer your question:

https://icinga.com/docs/icinga2/latest/doc/07-agent-based-monitoring/

Tip
Best practice is to use the Icinga agent as secure execution bridge (check_nt and check_nrpe are considered insecure) and query the NSClient++ service locally.

https://icinga.com/docs/icinga2/v2.11.0-rc1/doc/07-agent-based-monitoring/

Note

The NRPE protocol is considered insecure and has multiple flaws in its design. Upstream is not willing to fix these issues.
In order to stay safe, please use the native Icinga 2 client instead.

Also:
https://monitoring-portal.org/woltlab/index.php?thread/33500-icinga2-welchen-remote-agenten-nutzen-icinga-client-nrpe-check-by-ssh/&postID=216765#post216765
https://monitoring-portal.org/woltlab/index.php?thread/42550-nrpe-deprecated-and-insecure/&postID=259686#post259686

etc.

Napsty · November 1, 2019, 2:58pm

Well, I’m not telling you what the best agent strategy is, because that depends on your own experience and environment, and you have to figure out yourself what works best for you. However, note that this information:

The NRPE protocol is considered insecure and has multiple flaws in its
design. Upstream is not willing to fix these issues.

has been in the documentation for a long time (since Icinga 2.2.0 in 2014) and the NRPE project has (finally!) been revived in the meantime (https://github.com/NagiosEnterprises/nrpe), including security and TLS fixes and new releases. These releases (3.x) are meanwhile also available on new Linux distro versions. Although the Icinga documentation is good and trustworthy (I contributed, too), the rule of thumb of the Internet applies here: Don’t trust a single source without verification.

Of course, blindly trusting NRPE isn’t good either. Firewall rules and correct configuration are a must! Only allow trusted remote connections to your NRPE daemon.

holobolo0815 · November 1, 2019, 3:21pm

Well, the single source of information telling that NRPE is fine seems to be the NRPE project itself. It might even be true in newer versions, as you describe. I will investigate in that direction.

When you look for NRPE generally you only find articles like those that I posted: docs, guides, forums etc. all advising you to rather not use NRPE…

dnsmichi · November 2, 2019, 12:21pm

That’s only half of the truth. Implementing TLS alone still leaves you exposed to MITM attacks. Additional measures are required, such as validating the CN against the exposed FQDN or against certificate authorities, the way browsers do.

In addition to that, another layer of security helps: only allowing a list of known CNs to connect, and doing more than just a TLS handshake. You can see that in Icinga with the Zone hierarchy, where Endpoint objects match the CN. That way, only configured and known agents are really allowed to e.g. receive commands and send back check results.
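The allow-list idea can be sketched in a few lines. This is not Icinga's implementation, only an illustration of checking a peer's CN against configured names after the handshake. The certificate dict follows the shape returned by Python's ssl.SSLSocket.getpeercert(), and the host names are made up.

```python
# Sketch: accept a peer only if its certificate CN is a configured endpoint.
# Illustrative only; the names and the allow-list are assumptions.

def common_name(peer_cert: dict):
    """Extract the subject CN from a getpeercert()-style dict, or None."""
    for rdn in peer_cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


def is_known_agent(peer_cert: dict, allowed_cns) -> bool:
    """True only when the CN is present and explicitly allowed."""
    cn = common_name(peer_cert)
    return cn is not None and cn in allowed_cns


if __name__ == "__main__":
    allowed = {"agent1.example.com", "agent2.example.com"}
    cert = {"subject": ((("commonName", "agent1.example.com"),),)}
    print(is_known_agent(cert, allowed))
```

A peer with a valid certificate from the same CA but an unlisted CN is still rejected, which is the extra layer on top of plain TLS.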

Versions

NRPE v2 uses a precompiled anonymous DH secret, which means that any two copies of the exact same Debian binary will trust each other. If allowed_hosts is set to any address, you can happily talk to any NRPE agent in the world. Finding out which operating system and NRPE version are in use is fairly easy these days: just trial and error.

TL;DR - TLS is broken in NRPEv2, don’t use it.

NRPE v3 optionally added certificate support next to the old DH methods.

Compatibility

v3 has one problem: it uses a different protocol than v2, which leaves out other agents implementing NRPE support, like NSClient++. So if you want to monitor Windows agents with NSClient++, you are bound to NRPEv2 and anon DH TLS.

Security?

Also, NRPE does not enforce TLS by default. You need to go the extra mile to secure your environment, which imho is the wrong strategy in modern times.

Certificates are also an extra step.

# Values: 0 = Don't ask for or require client certificates (default)

TLS Versions

And, as I just read, you are allowed to use older, weak protocol versions below TLS 1.2, notably SSLv3. That is the default (!).

The ssl_version directive lets you set which versions of SSL/TLS you want to allow. SSLv2, SSLv3, TLSv1, TLSv1.1 and TLSv1.2 are allowed.
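For comparison, this is roughly what "no old protocol versions" looks like on the client side. It is a minimal sketch using Python's standard ssl module, unrelated to NRPE or Icinga code, with a hypothetical helper name.

```python
import ssl

# Sketch: a client context that refuses anything older than TLS 1.2.
# strict_client_context is a hypothetical helper name.

def strict_client_context() -> ssl.SSLContext:
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Handshakes offering only SSLv3, TLS 1.0 or TLS 1.1 will fail.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


if __name__ == "__main__":
    ctx = strict_client_context()
    print(ctx.minimum_version.name, ctx.check_hostname)
```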

Ciphers

With Icinga 2.11, we’ve analyzed the available TLS ciphers in depth. That made us remove common weak cipher suites.

Now when I look at the NRPEv3 defaults, this includes

#ssl_cipher_list=ALL:!MD5:@STRENGTH:@SECLEVEL=0

If you really want to use old and broken certificates use the custom configuration option tls-cipher “DEFAULT:@SECLEVEL=0”
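The settings discussed so far can be checked mechanically. Below is a rough sketch of such an audit, assuming the directive names ssl_version, ssl_cipher_list, ssl_client_certs and dont_blame_nrpe as used in an NRPE v3 nrpe.cfg, and assuming that "TLSv1.2" and "TLSv1.2+" are the acceptable ssl_version values. The sample config is made up, so verify the names and values against your own nrpe.cfg.

```python
# Sketch: flag weak settings in an nrpe.cfg-style text.
# Directive names and accepted values are assumptions, see the lead-in.

def parse_cfg(text: str) -> dict:
    """Parse key=value lines, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def audit(settings: dict) -> list:
    """Return a list of human-readable findings (empty if none)."""
    findings = []
    if settings.get("ssl_version") not in ("TLSv1.2", "TLSv1.2+"):
        findings.append("ssl_version is not pinned to TLSv1.2 or newer")
    if "SECLEVEL=0" in settings.get("ssl_cipher_list", ""):
        findings.append("ssl_cipher_list lowers the security level to 0")
    if settings.get("ssl_client_certs", "0") != "2":
        findings.append("client certificates are not required")
    if settings.get("dont_blame_nrpe", "0") == "1":
        findings.append("dont_blame_nrpe=1 accepts arguments from the caller")
    return findings


SAMPLE = """
# hypothetical excerpt
allowed_hosts=127.0.0.1
dont_blame_nrpe=1
ssl_version=SSLv3+
ssl_cipher_list=ALL:!MD5:@STRENGTH:@SECLEVEL=0
ssl_client_certs=0
"""

if __name__ == "__main__":
    for finding in audit(parse_cfg(SAMPLE)):
        print("WARN:", finding)
```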

Worth a read:
https://www.acunetix.com/blog/articles/tls-ssl-cipher-hardening/

Extensions

The CN is not written into the certificate subject when following these instructions. Therefore, additional measures after the TLS handshake cannot be applied on the application layer.

Also, certificate revocation lists are missing.

Remote Command Injection

One last thing: dont_blame_nrpe=1 means that arbitrary command execution can be added after any local plugin call. That can be used to override existing plugin parameters (e.g. to snoop around and guess mounted disks, or to trigger exploits). The Debian packages have disabled this at compile time; other systems are still vulnerable to this.

The Icinga agent denies additional arguments and requires the commands to be defined. You can even provide local check commands which remove certain arguments from the parent caller, if required.
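The difference between "caller supplies arguments" and "only defined commands" can be illustrated with a small sketch. This is not how the Icinga agent is implemented. The command names and plugin paths are hypothetical, and nothing is actually executed here.

```python
import shlex

# Sketch: resolve a request to a fixed argv; never accept caller arguments.
# Command names and plugin paths below are hypothetical examples.
COMMANDS = {
    "check_disk_root": ["/usr/lib/nagios/plugins/check_disk",
                        "-w", "20%", "-c", "10%", "-p", "/"],
    "check_load": ["/usr/lib/nagios/plugins/check_load",
                   "-w", "5,4,3", "-c", "10,8,6"],
}


def resolve(request_line: str) -> list:
    """Return the predefined argv for a command name, or raise."""
    parts = shlex.split(request_line)
    if len(parts) != 1:
        raise PermissionError("exactly one command name expected, no arguments")
    name = parts[0]
    if name not in COMMANDS:
        raise PermissionError(f"unknown command: {name}")
    # Copy so callers cannot mutate the definition.
    return list(COMMANDS[name])


if __name__ == "__main__":
    print(resolve("check_load"))
    try:
        resolve("check_disk_root -p /etc")
    except PermissionError as err:
        print("rejected:", err)
```

The resolved list would then be handed to something like subprocess.run() without a shell, so nothing the caller sends ever reaches the command line.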

For further security insights, please refer to the CVEs available in this regard.

Conclusion

That being said, we care about security and are therefore doing everything to make agents reliable and secure. The agent transports other than Icinga that we can recommend are listed in our documentation.

NRPE is not one of them, not even with v3 and its weak default settings. You might be able to secure it if you are a TLS/SSL expert, but with the little documentation provided, I rather doubt that. Also, the missing CN validation in the application layer is not a good sign. Therefore we discourage its usage and do not support it.

In terms of NSClient++ & NRPE as a transport, we are already working on our own solution; more can be seen at OSMC next week.

Cheers,
Michael

Napsty · November 2, 2019, 2:43pm

Thanks for that thorough summary!

As already written, I’m not saying that NRPE is as secure as it should/could be because that heavily depends on where you put (your own) base line of security. I personally would never expose the NRPE daemon to the Internet - but some people might and this is where the risks start. The same goes for the dont_blame_nrpe setting which has also caused screaming against the maintainer. This setting in combination with definition of check_commands was not properly used by some users. That’s why the command args are by default not compiled in Debian packages. But again, if this is used correctly (!) you can avoid the risks.

I’m not defending NRPE in any way, but I’m just saying that calling an SSL/TLS encrypted connection insecure in the official docs is misleading and unfair towards another (competing) product. By that logic, even the Icinga 2 API would be open to MITM attacks (and what isn’t, if it is not validated against a client certificate or a certificate chain?). A warning that a wrongly configured NRPE daemon poses a security threat would imho be better (should I make a PR?).

That said: It is 2019 and I really wish that NRPE could be tuned for additional security. Even a simple user authentication with encrypted credentials sent from the check_nt client to the NRPE daemon would help a lot! (a bit like NSClient++ does it). Maybe we should fork NRPE?

dnsmichi · November 2, 2019, 5:25pm

I wouldn’t call it “unfair” to call NRPE’s security flawed. Also, it originates from a different vendor, one which doesn’t support Icinga and is in a hate relationship with us. I wouldn’t want to expose a customer to a scenario where they cannot get official support, or where one side decides to cut things off, or where the software becomes unmaintained. Who knows how long NRPE will actually be developed in the future, or receive (security) bugfix releases.

The Icinga docs were always a bit back and forth. With 2.11, the decision was made to only propose one way: the Icinga agent, and on systems where this is not applicable, SSH. SNMP solves a different problem, mainly on network hardware. No PR is necessary in this regard.

You can of course use NRPE, even with certificates, if you manage to keep up with that. It just doesn’t have additional comfort layers like CSR auto/on-demand signing in distributed environments, so people will still take the route with less security. I haven’t found anything on Google for making NRPE secure with TLS other than cumbersome openssl commands.

If you want to go the forking route, many have tried that already and failed, including myself. There was a time when we had irpe with certificate handling & IPv6, but it never got released, in favor of Icinga 2. nrpe-ng has good attempts in it, but seems unmaintained. I wouldn’t bother with the C code either, in times where Golang exists.

To me, NRPE is legacy and that’s my last comment on the matter.

Cheers,
Michael

MarcusCaepio · November 5, 2019, 4:44pm

Hey all,
thanks to @dnsmichi for this very detailed explanation.
What do you guys think about monitoring Windows via WMI and check_wmi_plus?
Also, what do you think about PowerShell via SSH, which comes with the latest Windows Server versions?

Cheers,
Marcus

unic · November 6, 2019, 10:28am

With check_wmi you need the old wmic, and I am not sure that is the way to go for the future, as it is not actively developed anymore afaik. In my test I had a lot of problems getting it up and running on Ubuntu. But I have not tried this nice howto: Monitoring windows remotely through WMI

I tried PowerShell Core on Linux and then used PS remote sessions with WinRM. That worked very well, even without SSH. But you need to rewrite every check, as you don’t get exit codes from that. So you need to return the exit code as a string and interpret it. Would be a nice project.
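The "return the exit code as a string and interpret it" part could look roughly like this on the monitoring side. The EXITCODE=<n> marker on the last output line is an invented convention for this sketch, not something PowerShell or WinRM provides. The remote script would have to print it itself.

```python
import re

# Sketch: recover a plugin exit code that the remote script printed as text.
# The "EXITCODE=<n>" trailer is a made-up convention for illustration.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}
MARKER = re.compile(r"^EXITCODE=(\d+)$")


def interpret(remote_output: str):
    """Return (exit_code, plugin_output); fall back to UNKNOWN (3)."""
    lines = remote_output.strip().splitlines()
    if lines:
        match = MARKER.match(lines[-1].strip())
        if match:
            code = int(match.group(1))
            if code in STATES:
                # Strip the marker line from what is shown to the user.
                return code, "\n".join(lines[:-1])
    return 3, remote_output.strip() or "no output from remote check"


if __name__ == "__main__":
    code, text = interpret("DISK CRITICAL - C: 2% free\nEXITCODE=2")
    print(STATES[code], "-", text)
    # A wrapper plugin would end with: raise SystemExit(code)
```

Anything without a valid marker is treated as UNKNOWN, so a broken remote session never shows up as OK.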

MarcusCaepio · November 6, 2019, 11:00am

I wouldn’t say it’s old. A new version of check_wmi_plus was released last month. And WMI is still valid. I like that on the one hand you have user and password authentication specifically for WMI, and on the other hand easy handling via GPO. You really should have a look at the tutorial. It’s not that hard to get WMI running. And you can create a bunch of custom checks based on any Windows counter.