I'm designing a distributed client/server system in C++, in which many clients send requests to many servers over TCP, and each server spawns a thread to handle a request and send back its response. In my use case, only a limited number of clients will access a server, and I need very high performance. The data sent between clients and servers is small but very frequent, so creating a connection and tearing it down after each use is expensive. I want to use connection caching to solve this problem: once a connection is created, it is stored in a cache for future use. (Assume that the number of clients will not exceed the size of the cache.)
My question is:
Any answer or suggestion will be appreciated. Or can anyone give me an example of a connection pool or connection caching?
I saw someone say that connection pooling is a client-side technique. ... if no connection is made, who would trigger accept() on the server side and spawn a thread?
Firstly, connection pooling is not just a client-side technique; it's a connection-mode technique. It applies to both types of peer (the "server" and the "client").
Secondly, accept doesn't need to be called to start a thread. Programs can start threads for any reason they like... They could start threads just to start more threads, in a massively parallelised loop of thread creation. (edit: we call this a "fork bomb")
Finally, an efficient thread-pooling implementation won't start a thread for each client. Each thread typically occupies between 512KB and 4MB (counting stack space and other context information), so if you have 10000 clients each occupying that much, that's somewhere between ~5GB and ~40GB of wasted memory.
I want to do so, but I just don't know how to do it in the multithreading case.
You shouldn't use multithreading here... At least, not until you have a solution that uses a single thread, and you decide that it's not fast enough. At the moment you don't have that information; you're just guessing, and guessing doesn't guarantee optimisation.
At the turn of the century there were FTP servers that solved the C10K problem; they were able to handle 10000 clients at any given time, browsing, downloading or idling as users tend to do on FTP servers. They solved that problem not by using threads, but by using non-blocking and/or asynchronous sockets and/or calls.
To clarify, those servers handled thousands of connections with a single thread! One typical way is to use select, but I'm not particularly fond of that method because it requires a rather ugly series of loops. I prefer to use ioctlsocket on Windows and fcntl on POSIX OSes to put the file descriptor into non-blocking mode, e.g.:
#ifdef WIN32
ioctlsocket(fd, FIONBIO, (u_long[]){1});
#else
fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
#endif
At this point, recv and read won't block when operating on fd; if there's no data available, they'll return an error value immediately rather than waiting for data to arrive. That means you can loop over multiple sockets.
If connection pooling also needs to be implemented on the server side, how can I know where a request came from?
Store the client fd alongside its struct sockaddr_storage, and any other stateful information you need to keep about clients, in a struct that you declare however you see fit. If this ends up being 4KB (which is a fairly large struct, usually about as large as they need to get), then 10000 of these will only occupy about 40000KB (~40MB). Even the mobile phones of today should have no problem handling that. Consider completing the following code for your needs:
#include <errno.h>
#include <stddef.h>
#include <string.h>

#ifdef WIN32
#include <winsock2.h>
#else
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>
#endif

struct client {
    struct sockaddr_storage addr;
    socklen_t addr_len;
    int fd;
    /* Other stateful information */
};

#define BUFFER_SIZE 4096
#define CLIENT_COUNT 10000

int main(void) {
    int server;
    struct client client[CLIENT_COUNT] = { 0 };
    size_t client_count = 0;
    /* XXX: Perform usual socket/bind/listen to initialise server */
#ifdef WIN32
    ioctlsocket(server, FIONBIO, (u_long[]){1});
#else
    fcntl(server, F_SETFL, fcntl(server, F_GETFL, 0) | O_NONBLOCK);
#endif
    for (;;) {
        /* Accept a connection if possible */
        if (client_count < sizeof client / sizeof *client) {
            struct sockaddr_storage addr = { 0 };
            socklen_t addr_len = sizeof addr;
            int fd = accept(server, (struct sockaddr *) &addr, &addr_len);
            if (fd != -1) {
#ifdef WIN32
                ioctlsocket(fd, FIONBIO, (u_long[]){1});
#else
                fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
#endif
                client[client_count++] = (struct client) { .addr = addr
                                                         , .addr_len = addr_len
                                                         , .fd = fd };
            }
        }
        /* Loop through clients */
        char buffer[BUFFER_SIZE];
        for (size_t index = 0; index < client_count; index++) {
            ssize_t bytes_recvd = recv(client[index].fd, buffer, sizeof buffer, 0);
#ifdef WIN32
            int would_block = bytes_recvd < 0 && WSAGetLastError() == WSAEWOULDBLOCK;
#else
            int would_block = bytes_recvd < 0 && (errno == EAGAIN || errno == EWOULDBLOCK);
#endif
            if (would_block) {
                continue; /* No data yet; check this client again next pass */
            }
            if (bytes_recvd <= 0) {
                /* Peer closed the connection (0), or a real error occurred */
                close(client[index].fd);
                client_count--;
                memmove(client + index, client + index + 1,
                        (client_count - index) * sizeof *client);
                index--; /* Re-examine the element shifted into this slot */
                continue;
            }
            /* XXX: Process buffer[0..bytes_recvd-1] */
        }
        sleep(0); /* Yield so the kernel can queue more data for us;
                   * a poll/select with a timeout avoids spinning */
    }
}
Supposing you want to pool connections on the client side, the code would look very similar, except that there would obviously be no need for the accept-related code. Supposing you have an array of clients that you want to connect, you could use non-blocking connect calls to perform all of the connections at once, like this:
size_t index = 0, in_progress = 0;
for (;;) {
    if (client[index].fd == 0) {
        client[index].fd = socket(/* TODO */);
#ifdef WIN32
        ioctlsocket(client[index].fd, FIONBIO, (u_long[]){1});
#else
        fcntl(client[index].fd, F_SETFL, fcntl(client[index].fd, F_GETFL, 0) | O_NONBLOCK);
#endif
    }
#ifdef WIN32
    in_progress += connect(client[index].fd, (struct sockaddr *) &client[index].addr, client[index].addr_len) < 0
                && (WSAGetLastError() == WSAEALREADY
                 || WSAGetLastError() == WSAEWOULDBLOCK
                 || WSAGetLastError() == WSAEINVAL);
#else
    in_progress += connect(client[index].fd, (struct sockaddr *) &client[index].addr, client[index].addr_len) < 0
                && (errno == EALREADY
                 || errno == EINPROGRESS);
#endif
    if (++index < sizeof client / sizeof *client) {
        continue;
    }
    index = 0;
    if (in_progress == 0) {
        break; /* No connections still pending: all have finished */
    }
    in_progress = 0;
}
As for optimisation, given that this should be able to handle 10000 clients with perhaps a few minor tweaks, you shouldn't need multiple threads.
Nonetheless, by associating items from a mutex collection with clients and preceding each non-blocking socket operation with a non-blocking pthread_mutex_trylock, the above loops could be adapted to run simultaneously in multiple threads while processing the same group of sockets. This provides a working model for all POSIX-compliant platforms, but it's not a perfectly optimal one. To achieve optimality, we must step into the asynchronous world, which varies from system to system:

- Windows: WSA* functions with call-backs.
- BSD and Linux: kqueue and epoll, respectively.

It may pay to codify the "non-blocking socket operation" abstraction mentioned earlier, as the two asynchronous mechanisms vary significantly with respect to their interfaces. Like everything else, unfortunately we must write abstractions so that our Windows-relevant code remains legible on POSIX-compliant systems. As a bonus, this will allow us to mingle server processing (i.e. accept and anything that follows) with client processing (i.e. connect and anything that follows), so our server loop can become a client loop (or vice versa).