 

Best approach for writing a Linux server in C (pthreads, select or fork?)

Tags: c++, c, linux, sockets

I have a very specific question about server programming in UNIX (Debian, kernel 2.6.32). My goal is to learn how to write a server that can handle a huge number of clients. My target is more than 30,000 concurrent clients (a colleague even mentions that 500,000 are possible, which seems like quite a huge number :-)), but I really don't know what is possible, and that is why I am asking here. So, my first question: how many simultaneous clients are possible?

Clients can connect whenever they want, get in contact with other clients, and form a group (one group contains a maximum of 12 clients). They can chat with each other, so the TCP packet size varies depending on the message sent. Clients can also send mathematical formulas to the server. The server solves them and broadcasts the answer back to the group. This is quite a heavy operation.

My current approach is to start up the server, then use fork to create a daemon process. The daemon process binds the socket fd_listen and starts listening. It runs a while (1) loop in which I use accept() to pick up incoming connections.

Once a client connects, I create a pthread for that client, which handles the communication. Clients get added to a group and share some memory (needed to keep the group running), but every client still runs on its own thread. Getting the memory access right was quite a hassle, but it works fine now.

At the beginning of the program I read the /proc/sys/kernel/threads-max file and create my threads according to that. The number of possible threads according to that file is around 5000, far below the number of clients I want to be able to serve. Another approach I am considering is to use select() and create sets. But the access time to find a socket within a set is O(N). This can take quite long if I have more than a couple of thousand clients connected. Please correct me if I am wrong.

Well, I guess I need some ideas :-)

Groetjes Markus

P.S. I tagged it for C++ and C because it applies to both languages.

markus_p asked Apr 02 '12 14:04



2 Answers

The best approach as of today is an event loop like libev or libevent.

In most cases you will find that one thread is more than enough, but even if it isn't, you can always have multiple threads with separate loops (at least with libev).

Libev[ent] uses the most efficient polling mechanism available on each OS (epoll on Linux, kqueue on the BSDs), and anything is more efficient than select or a thread per socket.
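To make the event-loop idea concrete, here is a minimal sketch that uses Linux epoll directly rather than the libev or libevent API. The function names `watch_fd` and `run_echo_loop` are made up for this illustration, and echoing the data back is only a stand-in for the real chat and group logic. It also assumes blocking sockets and ignores partial writes, which a production server would have to handle.

```c
#include <errno.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_EVENTS 64

/* Register fd for read readiness. This is done once per descriptor;
 * the kernel keeps the interest list between epoll_wait() calls. */
int watch_fd(int epfd, int fd)
{
    struct epoll_event ev;
    memset(&ev, 0, sizeof ev);
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Process up to `rounds` batches of ready descriptors on one thread.
 * Returns the number of messages echoed, or -1 on error. */
int run_echo_loop(int epfd, int listen_fd, int rounds, int timeout_ms)
{
    struct epoll_event events[MAX_EVENTS];
    int echoed = 0;

    for (int r = 0; r < rounds; r++) {
        int n = epoll_wait(epfd, events, MAX_EVENTS, timeout_ms);
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal, retry */
            return -1;
        }
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            if (fd == listen_fd) {
                /* New connection: accept it and start watching it. */
                int client = accept(listen_fd, NULL, NULL);
                if (client >= 0 && watch_fd(epfd, client) < 0)
                    close(client);
            } else {
                char buf[512];
                ssize_t len = read(fd, buf, sizeof buf);
                if (len <= 0) {
                    /* EOF or error: close() also drops the descriptor
                     * from the epoll set (no dup'ed copies exist here). */
                    close(fd);
                } else if (write(fd, buf, (size_t)len) == len) {
                    echoed++;           /* placeholder for real handling */
                }
            }
        }
    }
    return echoed;
}
```

A caller would create the epoll instance with `epoll_create1(0)`, register the listening socket with `watch_fd`, and then call `run_echo_loop` repeatedly. The cost of each `epoll_wait` call depends on how many descriptors are ready, not on how many are registered, which is the main difference from select.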

a sad dude answered Oct 10 '22 05:10



You'll run into a couple of limits:

  1. fd_set size: This is changeable at compile time, but has quite a low default limit (FD_SETSIZE, typically 1024 on Linux); this affects select-based solutions.
  2. Thread-per-socket will run out of steam far earlier. I suggest putting the long calculations in separate threads (with pooling if required); otherwise a single-threaded approach will probably scale.
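The suggestion to move the long calculations into pooled threads could look roughly like the sketch below. Everything here is hypothetical: `pool_t`, `pool_start`, `pool_submit`, `pool_stop` and the queue sizes are names and values invented for the example, and `solve_formula` is just a placeholder (it sums 1..n) for the real formula solver. Link with `-pthread`.

```c
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP   64
#define MAX_WORKERS 8

typedef struct { long input; long *result; } job_t;

typedef struct {
    job_t jobs[QUEUE_CAP];      /* ring buffer of queued jobs */
    size_t head, count;
    int stopping;
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
    pthread_t workers[MAX_WORKERS];
    size_t nworkers;
} pool_t;

/* Stand-in for the expensive formula solver: sum of 1..n. */
long solve_formula(long n)
{
    long sum = 0;
    for (long i = 1; i <= n; i++)
        sum += i;
    return sum;
}

static void *worker_main(void *arg)
{
    pool_t *p = arg;
    for (;;) {
        pthread_mutex_lock(&p->lock);
        while (p->count == 0 && !p->stopping)
            pthread_cond_wait(&p->not_empty, &p->lock);
        if (p->count == 0) {    /* stopping and nothing left to do */
            pthread_mutex_unlock(&p->lock);
            return NULL;
        }
        job_t job = p->jobs[p->head];
        p->head = (p->head + 1) % QUEUE_CAP;
        p->count--;
        pthread_mutex_unlock(&p->lock);

        /* The heavy work runs without holding the lock. */
        *job.result = solve_formula(job.input);
    }
}

/* Start nworkers threads; returns 0 on success, -1 otherwise. */
int pool_start(pool_t *p, size_t nworkers)
{
    if (nworkers == 0 || nworkers > MAX_WORKERS)
        return -1;
    p->head = p->count = 0;
    p->stopping = 0;
    p->nworkers = 0;
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->not_empty, NULL);
    while (p->nworkers < nworkers) {
        if (pthread_create(&p->workers[p->nworkers], NULL, worker_main, p) != 0)
            return -1;
        p->nworkers++;
    }
    return 0;
}

/* Queue a job without blocking; returns -1 if the queue is full. */
int pool_submit(pool_t *p, long input, long *result)
{
    int rc = -1;
    pthread_mutex_lock(&p->lock);
    if (p->count < QUEUE_CAP) {
        job_t job = { input, result };
        p->jobs[(p->head + p->count) % QUEUE_CAP] = job;
        p->count++;
        pthread_cond_signal(&p->not_empty);
        rc = 0;
    }
    pthread_mutex_unlock(&p->lock);
    return rc;
}

/* Let the workers drain the queue, then join them. Results are safe
 * to read once this returns. */
void pool_stop(pool_t *p)
{
    pthread_mutex_lock(&p->lock);
    p->stopping = 1;
    pthread_cond_broadcast(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
    for (size_t i = 0; i < p->nworkers; i++)
        pthread_join(p->workers[i], NULL);
}
```

`pool_submit` deliberately never blocks, so a single-threaded network loop can hand off work and keep serving sockets. In a real server a worker would also need a way to tell the network thread that an answer is ready, for example by writing to a pipe the loop is watching; that part is left out here.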

To reach 500,000 you'll need a set of machines, and round-robin DNS I suspect.

TCP ports shouldn't be a problem, as long as the server doesn't connect back to the clients. I always seem to forget this, and have to be reminded.

File descriptors themselves shouldn't be too much of a problem, I think, but getting them into your polling solution may be more difficult: you certainly don't want to pass the whole set in on every call.
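One practical detail on the descriptor side: each process has a limit on open file descriptors (RLIMIT_NOFILE), and the soft limit is often as low as 1024, so a server aiming for tens of thousands of sockets usually has to raise it. The sketch below shows one way to do that with getrlimit/setrlimit; the helper name `raise_fd_limit` is invented for this example, and going beyond the hard limit still requires root or a system configuration change.

```c
#include <sys/resource.h>

/* Try to raise the soft limit on open descriptors to `wanted`, capped
 * at the hard limit. Stores the resulting soft limit in *out.
 * Returns 0 on success, -1 on error. */
int raise_fd_limit(rlim_t wanted, rlim_t *out)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return -1;

    rlim_t target = wanted;
    if (lim.rlim_max != RLIM_INFINITY && target > lim.rlim_max)
        target = lim.rlim_max;  /* unprivileged code cannot pass the hard limit */

    if (target > lim.rlim_cur) {
        lim.rlim_cur = target;
        if (setrlimit(RLIMIT_NOFILE, &lim) != 0)
            return -1;
    }

    *out = lim.rlim_cur;
    return 0;
}
```

Calling something like `raise_fd_limit(65536, &limit)` early in startup, before accepting connections, tells you how many sockets the process can actually hold open.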

Douglas Leeder answered Oct 10 '22 04:10
