Is there a reason why I should use application-level heartbeating instead of TCP keepalives to detect stale connections, given that only Windows and Linux machines are involved in our setup?
The short answer is yes: there is a timeout enforced via TCP Keep-Alive. So no, the socket won't remain open forever; it will probably time out after a few hours.
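To make "a few hours" concrete, here is a rough sketch of the arithmetic. It assumes keepalive is enabled on the socket and that a dead peer is declared once the idle time has passed and every probe has gone unanswered. The function name is made up for illustration, and the numbers plugged in are the commonly documented Linux defaults (`tcp_keepalive_time` = 7200 s, `tcp_keepalive_intvl` = 75 s, `tcp_keepalive_probes` = 9); your systems may be tuned differently.

```python
def keepalive_detection_seconds(idle_s, interval_s, probes):
    """Approximate time until a silent peer is declared dead.

    idle_s:     seconds of inactivity before the first probe is sent
    interval_s: seconds between unanswered probes
    probes:     number of unanswered probes before the connection is dropped
    """
    return idle_s + interval_s * probes


# Assumed Linux defaults: 7200 s idle, 75 s between probes, 9 probes.
linux_default = keepalive_detection_seconds(7200, 75, 9)
print(linux_default, "seconds, about", round(linux_default / 3600, 2), "hours")
```

With those values the result is 7875 seconds, a little over two hours, which is why untuned keepalives are often considered too slow for detecting stale connections.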
A heartbeat is a type of communication packet that is sent between nodes. Heartbeats are used to monitor the health of nodes, networks and network interfaces, and to prevent cluster partitioning.
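As an illustration of the application-level approach, the bookkeeping side of a heartbeat scheme might look like the sketch below. This is only a minimal example under assumptions: the class name `HeartbeatMonitor`, its methods, and the timeout value are hypothetical, and actually sending and receiving the heartbeat packets is left out.

```python
import time


class HeartbeatMonitor:
    """Tracks when each peer was last heard from (hypothetical helper)."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout   # seconds of silence before a peer counts as stale
        self.clock = clock       # injectable so the logic can be tested without sleeping
        self.last_seen = {}      # peer id -> timestamp of the last heartbeat

    def beat(self, peer):
        """Record that a heartbeat from `peer` has just arrived."""
        self.last_seen[peer] = self.clock()

    def stale_peers(self):
        """Return the peers that have been silent for longer than the timeout."""
        now = self.clock()
        return sorted(p for p, t in self.last_seen.items() if now - t > self.timeout)
```

A receive loop would call `beat(peer)` for every heartbeat message and periodically close the connections listed by `stale_peers()`. Unlike TCP keepalive, this also notices a peer whose TCP stack still answers but whose application has hung.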
The default is 300 seconds. The Keep Alive Interval setting in the TCP profile is used to adjust the frequency at which the BIG-IP system sends TCP Keep-Alive packets to a remote host for connection validation.
The TCP Keepalive Timer feature provides a mechanism to identify dead connections. When a TCP connection on a routing device is idle for too long, the device sends a TCP keepalive packet to the peer with only the Acknowledgment (ACK) flag turned on.
It seems that the TCP keepalive parameters can't be set on a per-socket basis on Windows or OS X; that's why.
Edit: All parameters except the number of keepalive retransmissions can in fact be set on Windows (2000 onwards) too: http://msdn.microsoft.com/en-us/library/windows/desktop/dd877220%28v=vs.85%29.aspx
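For what it's worth, a per-socket setup along those lines could look roughly like this in Python. Treat it as a sketch, not a definitive recipe: the function name and the default values are made up for illustration. On Windows it uses `SIO_KEEPALIVE_VALS` (the control code the linked page describes), which takes the idle time and probe interval in milliseconds but no probe count. Elsewhere it uses `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` if the platform exposes them.

```python
import socket
import sys


def enable_keepalive(sock, idle_s=60, interval_s=10, count=5):
    """Turn on TCP keepalive for one socket and tune it where the OS allows."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if sys.platform == "win32":
        # (on/off, idle time in ms, probe interval in ms); the number of
        # probes cannot be set through this call.
        sock.ioctl(socket.SIO_KEEPALIVE_VALS,
                   (1, idle_s * 1000, interval_s * 1000))
    elif hasattr(socket, "TCP_KEEPIDLE"):
        # Linux-style per-socket options, all expressed in seconds / counts.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    # On other platforms only SO_KEEPALIVE is set and system defaults apply.
```

Calling `enable_keepalive(sock)` right after creating or accepting the socket would, with these example values, flag a dead peer on Linux after roughly 60 + 10 * 5 = 110 seconds instead of a couple of hours.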
I was trying to do this with ZeroMQ, but it seems that ZeroMQ does not support this on Windows?