To add: once such a connection exists (you can verify that with netstat or tcpdump), either host may start sending TCP packets over the wire.
The underlying cluster protocol is JSON-RPC with notifications, so the sender doesn’t wait for the receiver to acknowledge receipt of the message. Instead, a check execution, triggered via a web action for example, is simply sent from the master to the satellite.
The satellite decides on its own how to deal with it: local execution, or a remote command endpoint on the client. Once the plugin returns data, the instance parses it and decides which zones are responsible for the object. This triggers a cluster event, e.g. sending the check result from the satellite back to the master.
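To illustrate the notification style described above: in JSON-RPC 2.0, a notification is a request object without an `id` member, so the receiver never sends a response. Here is a minimal Python sketch of that idea. The method name and the parameters are illustrative placeholders, not the exact Icinga 2 wire format.

```python
import json


def make_notification(method, params):
    """Build a JSON-RPC 2.0 notification: a request with no "id" member."""
    return {"jsonrpc": "2.0", "method": method, "params": params}


def is_notification(message):
    """A message without an "id" expects no response from the receiver."""
    return "id" not in message


# Hypothetical method name and payload, for illustration only.
check_result = make_notification(
    "event::CheckResult",
    {"host": "satellite-host", "service": "ping4", "state": 0},
)

# Serialize for sending; the sender does not block waiting for a reply.
wire = json.dumps(check_result)
```

Because nothing is acknowledged at the protocol level, the sender cannot tell from the message exchange alone whether a single message arrived.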
If the connection between the master and satellite was cut off, such a message is stored in the replay log. The side which initiates the connection retries at a regular interval. Once the connection is re-established, past cluster events are replayed and your master’s history backend has everything again.
This was designed and implemented mainly for SLA reporting reasons. A cluster connection being cut off must not influence the actual service check being run in a different location/zone.
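The store-and-replay behaviour can be sketched roughly like this in Python. This is only a conceptual model under my own assumptions: the class name `ReplayLog`, the `send` callable and the use of `ConnectionError` are all hypothetical, and the real implementation persists events to disk rather than keeping them in memory.

```python
from collections import deque


class ReplayLog:
    """Keep events in order until the peer is reachable again (sketch)."""

    def __init__(self, send):
        # send: hypothetical callable that raises ConnectionError while
        # the cluster connection is down.
        self._send = send
        self._pending = deque()

    def emit(self, event):
        """Queue the event, then try to deliver everything in order."""
        self._pending.append(event)
        return self.replay()

    def replay(self):
        """Deliver pending events oldest first; stop at the first failure."""
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                return False  # still disconnected, keep events for later
            self._pending.popleft()
        return True

    def pending(self):
        return list(self._pending)
```

The check itself still runs and produces its result while the link is down; only the delivery is deferred until a later `replay()` succeeds.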
You can read more about JSON-RPC messages in the docs released with 2.10.5 yesterday.