How to Avoid ICMP identifiers colliding

Someone · August 14, 2020, 2:39pm

This post is not directly related to Icinga but rather to monitoring in general; feel free to move it to Meta Icinga if needed.

How to avoid ICMP identifiers colliding?

When you run a lot of ICMP tests from a single machine, you may end up with weird behavior like this. The example uses the ping command, but you’ll see something similar with anything that relies on ICMP echo request/reply:

PING 4.5.6.7 (4.5.6.7) 56(84) bytes of data.
64 bytes from 15.47.65.4: icmp_seq=1 ttl=252 time=59.0 ms
64 bytes from 15.47.65.4: icmp_seq=2 ttl=252 time=46.5 ms
64 bytes from 15.47.65.4: icmp_seq=3 ttl=252 time=66.8 ms
64 bytes from 15.47.65.4: icmp_seq=4 ttl=252 time=40.7 ms
64 bytes from 15.47.65.4: icmp_seq=5 ttl=252 time=51.8 ms
64 bytes from 15.47.65.4: icmp_seq=6 ttl=252 time=32.4 ms
64 bytes from 15.47.65.4: icmp_seq=7 ttl=252 time=31.1 ms
64 bytes from 15.47.65.4: icmp_seq=8 ttl=252 time=43.4 ms
64 bytes from 15.47.65.4: icmp_seq=9 ttl=252 time=30.6 ms
64 bytes from 15.47.65.4: icmp_seq=10 ttl=252 time=31.0 ms
64 bytes from 15.47.65.4: icmp_seq=11 ttl=252 time=56.4 ms
64 bytes from 248.67.25.9: icmp_seq=1 ttl=124 time=63.5 ms (DUP!)
64 bytes from 248.67.25.9: icmp_seq=2 ttl=124 time=35.9 ms (DUP!)
64 bytes from 248.67.25.9: icmp_seq=3 ttl=124 time=47.8 ms (DUP!)
64 bytes from 248.67.25.9: icmp_seq=4 ttl=124 time=33.6 ms (DUP!)
64 bytes from 248.67.25.9: icmp_seq=5 ttl=124 time=59.0 ms (DUP!)
64 bytes from 248.67.25.9: icmp_seq=6 ttl=124 time=33.9 ms (DUP!)
64 bytes from 248.67.25.9: icmp_seq=7 ttl=124 time=33.6 ms (DUP!)
64 bytes from 248.67.25.9: icmp_seq=8 ttl=124 time=61.8 ms (DUP!)
64 bytes from 248.67.25.9: icmp_seq=9 ttl=124 time=34.7 ms (DUP!)

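If you know the basics of networking, you’ll likely be surprised and wonder what the heck is going on: replies arrive from addresses other than the one being pinged, and some are flagged as duplicates.
Let’s analyze how this can happen and how to solve it, since it can break your ICMP-based monitoring.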

Like most network protocols, ICMP echo messages carry an identifier field in their header, which lets the operating system tell ongoing conversations apart and process each of them properly. This field is 16 bits wide, which means it can hold 65536 values.
On Linux, ping sets the ICMP identifier from the PID of the process, using this piece of code from the iputils package:

ident = htons(getpid() & 0xFFFF);

So, what does it mean?
Let’s take this example:
I start a ping command to 1.2.3.4 and the OS gives it PID 602, whose hex value is 0x025A.
I start another ping command to 4.5.6.7 and the OS gives it PID 983642, whose hex value is 0xF025A, but it is truncated to 0x025A since the ICMP identifier field is limited to 16 bits.

I end up pinging two different IPs with ICMP packets that share the same identifier, so the replies can no longer be told apart and each ping ends up accepting replies meant for the other, as shown above.
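
The truncation is easy to reproduce outside of ping. Here is a minimal Python sketch that mirrors only the masking step of the iputils line quoted above. The function name icmp_ident is made up for illustration, and the htons() byte swap is left out on purpose, since it changes the byte order on the wire but not which PIDs collide:

```python
def icmp_ident(pid: int) -> int:
    """Return the 16-bit ICMP identifier derived from a PID (low 16 bits only)."""
    return pid & 0xFFFF


# The two PIDs from the example above.
for pid in (602, 983642):
    print(f"pid {pid} = {pid:#x} -> identifier {icmp_ident(pid):#06x}")
```

Both PIDs map to identifier 0x025a, which is exactly the collision described above.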

Now, how do I deal with it?
You can limit the maximum PID the OS can allocate so that it matches the size of the ICMP identifier field (65535); two ping processes will then never share an identifier. The solution is quite straightforward: add this to sysctl.conf to make it permanent:
kernel.pid_max = 65535
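
To see whether a given box is exposed at all, you can compare its current pid_max with the 16-bit limit. This is a rough Python sketch, assuming a Linux host with /proc mounted. The helper names are my own, and it relies on pid_max being an exclusive upper bound, meaning the kernel does not hand out a PID equal to or above it:

```python
ICMP_IDENT_VALUES = 1 << 16  # 65536 possible 16-bit identifiers


def pids_can_collide(pid_max: int) -> bool:
    """True if two live PIDs could share the same low 16 bits."""
    # PIDs go up to pid_max - 1, so only values above 65536 allow a wrap.
    return pid_max > ICMP_IDENT_VALUES


def read_pid_max(path: str = "/proc/sys/kernel/pid_max") -> int:
    """Read the running kernel's pid_max (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())
```

On the monitoring host, pids_can_collide(read_pid_max()) should return False once the sysctl setting above is active.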

You are most likely to be affected if you run your ICMP tests on a big box: by default, the system derives the pid_max value from nr_cpu_ids * 1024.
nr_cpu_ids is determined by your hardware and how it is wired; it represents the maximum number of CPUs the machine can support, so on a beefy server pid_max can get pretty high!
You can get this information from dmesg, for example:

  $ grep -e pid_max -e nr_cpu_ids /var/log/dmesg
  [    0.000000] setup_percpu: NR_CPUS:5120 nr_cpumask_bits:448 nr_cpu_ids:448 nr_node_ids:2
  [    0.000176] pid_max: default: 458752 minimum: 3584

pid_max = nr_cpu_ids * 1024 = 448 * 1024 = 458752
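
The same arithmetic as a small Python sketch. Beware of the assumptions: the 32768 floor and the 4194304 ceiling are what I recall of the usual 64-bit kernel defaults and are not taken from the dmesg output above, and the function name is hypothetical:

```python
PIDS_PER_CPU_DEFAULT = 1024       # the multiplier mentioned above
PID_MAX_DEFAULT = 32768           # assumed floor (usual kernel default)
PID_MAX_LIMIT = 4 * 1024 * 1024   # assumed ceiling on 64-bit kernels


def default_pid_max(nr_cpu_ids: int) -> int:
    """Estimate the boot-time pid_max for a given nr_cpu_ids."""
    scaled = PIDS_PER_CPU_DEFAULT * nr_cpu_ids
    return min(PID_MAX_LIMIT, max(PID_MAX_DEFAULT, scaled))


print(default_pid_max(448))  # the server from the dmesg output: 458752
```

Under these assumptions, any machine with nr_cpu_ids above 64 starts with a pid_max larger than 65536, so identifier collisions become possible.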

Limiting pid_max may not be a good solution, since it can have side effects if you constantly run a very large number of processes, close to the 65k limit.

There is another way. By default, ping relies on a setuid binary to open a raw socket (SOCK_RAW in the kernel and Linux docs), but a kernel feature allows ping to use a SOCK_DGRAM socket instead. With this kind of socket the kernel picks the ICMP identifier itself and keeps it unique per socket, so the PID no longer matters.
As before, it is configured through sysctl, using net.ipv4.ping_group_range; it is disabled by default, with the value "1 0".
"0 65536" allows all GIDs from 0 to 65536 to use the SOCK_DGRAM socket. If you need to be more restrictive, for example for a specific user dedicated to monitoring (like the icinga one), you could use "990 990", which lets only that group use the non-raw sockets.

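To double-check what a given net.ipv4.ping_group_range value permits, here is a small Python sketch. The function names are made up, the bounds are treated as inclusive, and only a single GID is tested, whereas the kernel looks at every group the process belongs to:

```python
def parse_ping_group_range(value: str) -> tuple:
    """Split a value such as "990 990" into (low, high) integers."""
    low, high = (int(part) for part in value.split())
    return low, high


def gid_may_use_ping_socket(gid: int, value: str) -> bool:
    """True if gid falls inside the inclusive low..high range."""
    low, high = parse_ping_group_range(value)
    return low <= gid <= high


# GID 990 is the hypothetical monitoring group from the example above.
for setting in ("1 0", "0 65536", "990 990"):
    print(setting, "->", gid_may_use_ping_socket(990, setting))
```

The default "1 0" is an empty range (low is greater than high), which is why nobody can use these sockets until it is changed. Once your group is inside the range, opening socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_ICMP) from Python should work without root, which is a quick way to verify the setting.
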
kunsjef · December 19, 2023, 12:18pm

I need to drop a quick note to express my appreciation for this particular post that solved our problem with icmp packets colliding - a super annoying issue. Thank you random stranger for writing your findings and your analysis of the problem