
kernel-based (Linux) data relay between two TCP sockets

I wrote a TCP relay server that works like a peer-to-peer router (supernode).

The simplest case is two open sockets with data relayed between them:

clientA <---> server <---> clientB

However, the server has to serve about 2000 such A-B pairs, i.e. 4000 sockets.

There are two well-known userland implementations of a data stream relay (based on socketA.recv() --> socketB.send() and socketB.recv() --> socketA.send()):

  • using select / poll (non-blocking method)
  • using threads / forks (blocking method)
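For reference, the select / poll variant for a single A-B pair can be sketched as below. This is only a minimal sketch: the names copy_once and relay_poll_step are made up for illustration, the send side is blocking for simplicity, and a real server would keep all 4000 descriptors in one poll (or epoll) set instead of one pair per call.

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy one chunk from `from` to `to` through a userland buffer.
 * Returns bytes copied, 0 on EOF, -1 on error. */
static ssize_t copy_once(int from, int to)
{
    char buf[4096];
    ssize_t n = recv(from, buf, sizeof buf, 0);
    if (n <= 0)
        return n;

    ssize_t off = 0;
    while (off < n) {
        /* MSG_NOSIGNAL: get an error instead of SIGPIPE if the peer is gone. */
        ssize_t w = send(to, buf + off, (size_t)(n - off), MSG_NOSIGNAL);
        if (w < 0)
            return -1;
        off += w;
    }
    return n;
}

/* Wait up to timeout_ms for data on either socket and relay it once.
 * Returns total bytes relayed, 0 on timeout, -1 on EOF or error. */
static ssize_t relay_poll_step(int a, int b, int timeout_ms)
{
    struct pollfd fds[2] = {
        { .fd = a, .events = POLLIN },
        { .fd = b, .events = POLLIN },
    };
    int ready = poll(fds, 2, timeout_ms);
    if (ready <= 0)
        return ready;

    ssize_t total = 0;
    if (fds[0].revents & (POLLIN | POLLHUP | POLLERR)) {
        ssize_t n = copy_once(a, b);
        if (n <= 0)
            return -1;
        total += n;
    }
    if (fds[1].revents & (POLLIN | POLLHUP | POLLERR)) {
        ssize_t n = copy_once(b, a);
        if (n <= 0)
            return -1;
        total += n;
    }
    return total;
}
```

One thread can call such a step in a loop for many pairs, which is the usual argument for this method over one thread per direction.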

I used threads, so in the worst case the server creates 2*2000 threads! I had to limit the stack size, and it works, but is it the right solution?
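Limiting the stack size per thread can be done with pthread attributes, roughly as follows. This is a sketch under assumptions: the 256 KiB value is an arbitrary choice (it must not be below the platform's PTHREAD_STACK_MIN), and relay_worker is a hypothetical placeholder for the real relay loop.

```c
#include <pthread.h>
#include <stddef.h>

/* Assumed value for illustration; tune it to what the worker really needs. */
#define RELAY_STACK_BYTES (256 * 1024)

/* Hypothetical placeholder for the per-direction relay loop. */
static void *relay_worker(void *arg)
{
    return arg;
}

/* Start `fn` in a thread with a reduced stack.
 * Returns 0 on success or a pthread error number. */
static int spawn_small_stack(pthread_t *tid, void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setstacksize(&attr, RELAY_STACK_BYTES);
    if (rc == 0)
        rc = pthread_create(tid, &attr, fn, arg);

    pthread_attr_destroy(&attr);
    return rc;
}
```

With 4000 threads this still reserves on the order of 1 GB of address space for stacks at the assumed size, so it reduces the cost without removing it.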

Core of my question:

Is there a way to avoid active data relaying between two sockets in userland?

It seems there is a passive way. For example, I could create a file descriptor from each socket, create two pipes, and use dup2(), the same method as stdin/stdout redirection. The two threads would then be unnecessary for the data relay and could exit. The question is whether the server should ever close the sockets and pipes, and how it can detect that a pipe is broken so it can log the fact.

I've also found "socket pairs", but I am not sure whether they suit my purpose.

What solution would you advise to offload the userland and limit the number of threads?

Some extra explanations:

  • The server has a static routing table (e.g. ID_A paired with ID_B). Client A connects to the server and sends ID_A. The server then waits for client B. Once A and B are paired (both sockets open), the server starts the data relay.
  • The clients are simple devices behind symmetric NAT, so the N2N protocol and NAT traversal techniques are too complex for them.
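The pairing step described above could look roughly like this. It is only an illustration: struct route, register_client, and the string identifiers are assumptions, not the actual server code.

```c
#include <stddef.h>
#include <string.h>

/* One entry of the static routing table; fd_* is -1 until that side connects. */
struct route {
    const char *id_a;
    const char *id_b;
    int fd_a;
    int fd_b;
};

/* Record that the client identified by `id` connected on `fd`.
 * Returns the route when both sides are now connected (relay may start),
 * or NULL if the peer is still missing or the id is unknown. */
static struct route *register_client(struct route *table, size_t n,
                                     const char *id, int fd)
{
    for (size_t i = 0; i < n; i++) {
        struct route *r = &table[i];
        if (strcmp(r->id_a, id) == 0)
            r->fd_a = fd;
        else if (strcmp(r->id_b, id) == 0)
            r->fd_b = fd;
        else
            continue;
        return (r->fd_a >= 0 && r->fd_b >= 0) ? r : NULL;
    }
    return NULL; /* id not present in the routing table */
}
```

A linear scan is fine for about 2000 entries; a hash table would be the obvious replacement if the table grew much larger.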

Thanks to Gerhard Rieger, I have a hint:

I am aware of two kernel space ways to avoid read/write, recv/send in user space:

  • sendfile
  • splice

Both have restrictions regarding type of file descriptor.

dup2 will not help to do something in kernel, AFAIK.

Man pages: splice(2) vmsplice(2) sendfile(2) tee(2)

Related links:

  • Understanding sendfile() and splice()
  • http://blog.superpat.com/2010/06/01/zero-copy-in-linux-with-sendfile-and-splice/
  • http://yarchive.net/comp/linux/splice.html (Linus)
  • C, sendfile() and send() difference?
  • bridging between two file descriptors
  • Send and Receive a file in socket programming in Linux with C/C++ (GCC/G++)
  • http://ogris.de/howtos/splice.html
nopsoft asked Jul 11 '13 10:07



1 Answer

OpenBSD implements SO_SPLICE:

  • relayd asiabsdcon2013 slides / paper
  • http://www.manualpages.de/OpenBSD/OpenBSD-5.0/man2/setsockopt.2.html
  • http://metacpan.org/pod/BSD::Socket::Splice

Does Linux support something similar, or is a custom kernel module the only solution?

  • TCPSP
  • SP-MOD described here
  • TCP-Splicer described here
  • L4/L7 switch
  • HAProxy
nopsoft answered Sep 19 '22 11:09