After writing an answer about TCP_NODELAY and TCP_CORK, I realized that my knowledge of TCP_CORK's finer points must be lacking, since it's not 100% clear to me why the Linux developers felt it necessary to introduce a new TCP_CORK flag, rather than just relying on the application to set or clear the existing TCP_NODELAY flag at the appropriate times.
In particular, if I have a Linux application that wants to send() some small/non-contiguous fragments of data over a TCP stream without paying the 200 ms Nagle latency-tax, and at the same time minimize the number of packets needed to send it, I can do it in either of these two ways:
With TCP_CORK (pseudocode):
int optval = 1;
setsockopt(sk, SOL_TCP, TCP_CORK, &optval, sizeof(int)); // put a cork in it
send(sk, ..);
send(sk, ..);
send(sk, ..);
optval = 0;
setsockopt(sk, SOL_TCP, TCP_CORK, &optval, sizeof(int)); // release the cork
or with TCP_NODELAY (pseudocode):
int optval = 0;
setsockopt(sk, IPPROTO_TCP, TCP_NODELAY, &optval, sizeof(int)); // turn on Nagle's
send(sk, ..);
send(sk, ..);
send(sk, ..);
optval = 1;
setsockopt(sk, IPPROTO_TCP, TCP_NODELAY, &optval, sizeof(int)); // turn Nagle's back off
I've been using the latter technique for years with good results, and it has the benefit of being portable to non-Linux OS's as well (although outside of Linux you have to call send() again after turning Nagle's back off, in order to ensure the packets get sent immediately and avoid the Nagle-delay -- send()'ing zero bytes is sufficient).
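Concretely, the portable flush at the end looks something like this (a rough sketch; the buffer arguments are placeholders):
int one = 1;
setsockopt(sk, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)); // turn Nagle's back off
send(sk, "", 0, 0); // zero-byte send: pushes out any queued partial frame on non-Linux stacks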
Now the Linux devs are smart guys, so I doubt that the above usage of TCP_NODELAY never occurred to them. There must be some reason why they felt it was insufficient, which led them to introduce a new/proprietary TCP_CORK flag instead. Can anybody explain what that reason was?
You are really asking two questions: how TCP_CORK differs from TCP_NODELAY, and why TCP_CORK was introduced at all.
First, see the answers to this Stack Overflow question, since they are related: that question generally describes the difference between the two options without reference to your use case.
For your use case this means that in the first example no partial frames are sent until the cork is released, whereas in the second example a partial frame may go out as soon as all previously sent data has been acknowledged.
Also, after the final send in your first example, Nagle's algorithm still applies to any partial frames left over after uncorking, whereas in the second example it doesn't.
The short version: with TCP_NODELAY the logical packets are not accumulated before being sent as network packets; with Nagle's algorithm they are accumulated according to that algorithm's rules; and with TCP_CORK they are accumulated according to what the application tells the kernel. A side effect of this is that Nagle's algorithm will send a partial frame on an idle connection, whereas TCP_CORK won't.
Additionally, TCP_CORK was introduced into the Linux kernel in 2.2 (specifically 2.1.127, see here), but until 2.5.71 it was mutually exclusive with TCP_NODELAY. For example, in 2.4 kernels you could use one or the other, but from 2.6 onward you can combine the two, and TCP_CORK takes precedence while it is applied.
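As a sketch of how the two can be combined on a 2.6+ kernel (hdr/body and their lengths are just illustrative placeholders), you can leave TCP_NODELAY enabled for the life of the socket and cork/uncork around each burst; while the cork is in, TCP_CORK takes precedence, and the uncork then flushes the leftover partial frame immediately instead of leaving it to Nagle:
int one = 1, zero = 0;
setsockopt(sk, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)); // disable Nagle once; TCP_CORK takes precedence while corked
setsockopt(sk, IPPROTO_TCP, TCP_CORK, &one, sizeof(one));    // cork: only full frames go out
send(sk, hdr, hdr_len, 0);                                   // e.g. a response header
send(sk, body, body_len, 0);                                 // e.g. the payload
setsockopt(sk, IPPROTO_TCP, TCP_CORK, &zero, sizeof(zero));  // uncork: the remaining partial frame is sent at once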
Regarding your second question, to quote Linus Torvalds:
Now, TCP_CORK is basically me telling David Miller that I refuse to play games to have good packet size distribution, and that I wanted a way for the application to just tell the OS: I want big packets, please wait until you get enough data from me that you can make big packets.
Basically, TCP_CORK is a kind of "anti-nagle" flag. It's the reverse of "no-nagle".
Another quote, also from Linus, regarding the usage of TCP_CORK:
Basically, TCP_CORK is useful whenever the server knows the patterns of its bulk transfers. Which is just about 100% of the time with any kind of file serving.
For more quotes, see the linked Sendfile Mailing List Discussion.
In summary, in addition to TCP_MAXSEG and the MSG_MORE flag to send(), TCP_CORK is another tool that allows a userspace application finer-grained control over packet size distribution.
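For completeness, MSG_MORE gives you per-call corking without a separate setsockopt(); a rough equivalent of the corked example above would be (buffer names are again placeholders):
send(sk, hdr, hdr_len, MSG_MORE); // MSG_MORE: hint that more data follows, so the kernel holds back the partial frame
send(sk, body, body_len, 0);      // no MSG_MORE on the last call: the accumulated data is sent out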