
TcpListener based application that does not scale up well

I have an echo server application based on TcpListener. It accepts clients, reads the data, and returns the same data. I developed it using the async/await approach, with the XXXAsync methods provided by the framework.

I have set up performance counters to measure how many messages and bytes go in and out, and how many sockets are connected.

I have created a test application that starts 1400 asynchronous TcpClient instances, each sending a 1 KB message every 100-500 ms. Clients wait a random 10-1000 ms before starting, so they do not all try to connect at the same time. It works well: in Performance Monitor I can see the 1400 clients connected and sending messages at a good rate. I run the client app from another computer. The server's CPU and memory usage are very low; it is an Intel Core i7 with 8 GB of RAM. The client seems busier; it is an i5 with 4 GB of RAM, but it still does not even reach 25% CPU.

The problem appears when I start a second client application. Connections start to fail in the clients. I do not see a huge increase in messages per second (about 20% more), and the number of connected clients is only around 1900-2100 rather than the expected 2800. Performance decreases a little, and the graph shows bigger variations between the maximum and minimum messages per second than before.

Still, CPU usage does not even reach 40% and memory usage is still low. I have tried increasing the number of thread pool threads in both client and server:

ThreadPool.SetMaxThreads(5000, 5000);
ThreadPool.SetMinThreads(2000, 2000);

In the server, the connections are accepted in a loop:

while(true)
{
    var client = await _server.AcceptTcpClientAsync();
    HandleClientAsync(client);
}

The HandleClientAsync function returns a Task, but as you see the loop does not wait for the handling, just continues to accept another client. That handling function is something like this:

public async Task HandleClientAsync(TcpClient client)
{
    // Echo until the client disconnects or the server is cancelled.
    while(client.Connected && !_cancellation.IsCancellationRequested)
    {
        var msg = await ReadMessageAsync(client);
        await WriteMessageAsync(client, msg);
    }
}

Those two functions only read and write the stream asynchronously.
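
For reference, the accept loop and handler described above can be put together into a small self-contained sketch. This is not the code of the actual project: the class name EchoServerSketch, the 4096-byte buffer, and the raw byte echo (instead of ReadMessageAsync/WriteMessageAsync, whose bodies are not shown) are assumptions made only for illustration.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical minimal echo server following the loop shown above.
public sealed class EchoServerSketch
{
    private readonly TcpListener _server;
    private readonly CancellationTokenSource _cancellation = new CancellationTokenSource();

    public EchoServerSketch(IPAddress address, int port)
    {
        _server = new TcpListener(address, port);
    }

    // Port actually bound; useful when port 0 (any free port) was requested.
    public int Port => ((IPEndPoint)_server.LocalEndpoint).Port;

    public void Start()
    {
        _server.Start();
        _ = AcceptLoopAsync(); // fire and forget, as in the original loop
    }

    public void Stop()
    {
        _cancellation.Cancel();
        _server.Stop();
    }

    private async Task AcceptLoopAsync()
    {
        try
        {
            while (!_cancellation.IsCancellationRequested)
            {
                var client = await _server.AcceptTcpClientAsync();
                _ = HandleClientAsync(client); // not awaited: keep accepting
            }
        }
        catch (Exception) when (_cancellation.IsCancellationRequested)
        {
            // Stopping the listener aborts the pending accept; expected on shutdown.
        }
    }

    private async Task HandleClientAsync(TcpClient client)
    {
        var buffer = new byte[4096]; // assumed buffer size
        try
        {
            using (client)
            {
                var stream = client.GetStream();
                while (!_cancellation.IsCancellationRequested)
                {
                    int read = await stream.ReadAsync(buffer, 0, buffer.Length);
                    if (read == 0) break; // remote side closed the connection
                    await stream.WriteAsync(buffer, 0, read);
                }
            }
        }
        catch (IOException)
        {
            // Connection reset by the peer; nothing to echo any more.
        }
    }
}
```

Passing port 0 lets the OS pick a free port, which is convenient for a local load test: create the server with new EchoServerSketch(IPAddress.Loopback, 0), call Start(), and read Port.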

I have seen that I can start the TcpListener with a backlog value, but what is the default?
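
To make the backlog question concrete: TcpListener has a Start(int backlog) overload that sets the maximum length of the pending-connections queue. The sketch below only shows how an explicit value is passed; the helper name StartWithBacklog is hypothetical, and whatever value is requested may still be capped by the operating system.

```csharp
using System.Net;
using System.Net.Sockets;

public static class BacklogSketch
{
    // Starts a loopback listener on a free port with an explicit backlog,
    // i.e. the maximum number of connections waiting to be accepted.
    public static TcpListener StartWithBacklog(int backlog)
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start(backlog);
        return listener;
    }
}
```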

What could be the reason the app does not scale up until it reaches maximum CPU?

What approach and tools would help find out what the actual problem is?

UPDATE

I have tried the Task.Yield and Task.Run approaches, and they didn't help.

It also happens with server and client running locally on the same computer. Increasing the number of clients or messages per second actually reduces the service throughput: 600 clients sending a message every 100 ms generate more throughput than 1000 clients sending a message every 100 ms.

I see two exceptions on the client when connecting more than ~2000 clients. With around 1500 clients I see the exceptions at the beginning, but the clients eventually connect. With more than 1500 I see a lot of connections and disconnections:

"An existing connection was forcibly closed by the remote host" (System.Net.Sockets.SocketException) A System.Net.Sockets.SocketException was caught: "An existing connection was forcibly closed by the remote host"

"Unable to write data to the transport connection: An existing connection was forcibly closed by the remote host." (System.IO.IOException) A System.IO.IOException was thrown: "Unable to write data to the transport connection: An existing connection was forcibly closed by the remote host."

UPDATE 2

I have set up a very simple project with server and client using async/await and it scales as expected.

The project where I have the scalability problem is this WebSocket server, and even though it uses the same approach, apparently something is causing contention. There is a console application hosting the component, and a console application to generate load (although it requires at least Windows 8).

Please note that I am not asking for the answer to fix the problem directly, but for the techniques or approaches to find out what is causing that contention.

vtortola asked Feb 25 '14 11:02



1 Answer

I have managed to scale up to 6,000 concurrent connections without problems, processing around 24,000 messages per second, connecting from machine to machine (not a localhost test) and using only around 80 physical threads.

There are some lessons I learnt:

Increasing the thread pool size made things worse

Do not do it unless you know what you are doing.

Call Task.Run or yield with Task.Yield

This ensures that the calling thread is released from running the rest of the method.
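
A minimal sketch of both options, not the code from the project. An async method runs synchronously on the caller's thread until its first await that actually suspends, so a handler with synchronous work at the start holds up whoever called it. The Checksum method is a hypothetical stand-in for that work.

```csharp
using System.Threading.Tasks;

public static class ReleaseCallerSketch
{
    // Hypothetical CPU-bound step standing in for the start of a handler.
    private static int Checksum(byte[] data)
    {
        int sum = 0;
        foreach (byte b in data) sum += b;
        return sum;
    }

    // Option 1: yield first. The method returns an incomplete Task at the
    // await, and the rest is scheduled as a continuation (on the thread pool
    // when there is no SynchronizationContext).
    public static async Task<int> HandleWithYieldAsync(byte[] data)
    {
        await Task.Yield();
        return Checksum(data);
    }

    // Option 2: queue the whole body to the thread pool with Task.Run.
    public static Task<int> HandleWithTaskRun(byte[] data)
    {
        return Task.Run(() => Checksum(data));
    }
}
```

In an accept loop, either form lets the loop go back to accepting the next client right away instead of running the beginning of the handler inline.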

ConfigureAwait(false)

In an executable application, if you are confident you are not in a single-threaded synchronization context, this allows any thread to pick up the continuation rather than waiting specifically for the thread that started it to become free.
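
As an illustration only (the helper name ReadSomeAsync is made up for this sketch), this is what the call looks like on a stream read:

```csharp
using System.IO;
using System.Threading.Tasks;

public static class ConfigureAwaitSketch
{
    // Reads up to buffer.Length bytes. ConfigureAwait(false) tells the awaiter
    // not to capture the current SynchronizationContext, so the continuation
    // is free to run on whichever thread completes the read.
    public static async Task<int> ReadSomeAsync(Stream stream, byte[] buffer)
    {
        int read = await stream.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false);
        return read;
    }
}
```

In a plain console host there is normally no SynchronizationContext, so the effect there may be small; it matters most in library code that could be called from a UI or similar context.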

Byte[]

The memory profiler showed that the app was spending too much memory and time creating Byte[] instances, so I designed several strategies to reuse the available ones, or to work "in place" rather than create new ones and copy. The GC performance counters (specifically "% Time in GC", which was around 55%) raised the alarm that something was not right. I was also using BitArray instances to check bits in bytes, which caused some memory overhead as well, so I replaced them with bitwise operations and it improved. Later on I discovered that WCF uses a Byte[] pool to cope with this problem.
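
The sketch below shows the two ideas in a generic form. It is not the strategy used in the project: it swaps in ArrayPool<byte> from System.Buffers (available in current .NET, or through the System.Buffers package on older frameworks) as the buffer pool, and the method names are hypothetical.

```csharp
using System.Buffers;
using System.IO;
using System.Threading.Tasks;

public static class BufferReuseSketch
{
    // Copies source to destination with one rented buffer instead of
    // allocating a new byte[] for every read.
    public static async Task<long> CopyWithPooledBufferAsync(
        Stream source, Stream destination, int bufferSize = 4096)
    {
        // Rent may return an array larger than requested.
        byte[] buffer = ArrayPool<byte>.Shared.Rent(bufferSize);
        long total = 0;
        try
        {
            int read;
            while ((read = await source.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false)) > 0)
            {
                await destination.WriteAsync(buffer, 0, read).ConfigureAwait(false);
                total += read;
            }
        }
        finally
        {
            // Always hand the buffer back so other callers can reuse it.
            ArrayPool<byte>.Shared.Return(buffer);
        }
        return total;
    }

    // Tests bit `index` (0 = least significant) with a mask, without
    // allocating a BitArray. In a WebSocket frame, for example, the FIN
    // flag is bit 7 of the first header byte.
    public static bool IsBitSet(byte value, int index)
    {
        return (value & (1 << index)) != 0;
    }
}
```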

Asynchronous does not mean fast

Asynchronous code scales nicely, but it has a cost. Just because an asynchronous operation is available does not mean you should use it. Use asynchronous programming when you expect to wait for some time before getting the actual response. If you are sure the data is already there or the response will be quick, proceed synchronously.
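
One possible shape of this, as a sketch rather than the project's code: NetworkStream.DataAvailable reports whether data is already waiting on the socket, so the read can be done synchronously in that case and asynchronously otherwise. The helper name ReadPreferSyncAsync is an assumption.

```csharp
using System.Net.Sockets;
using System.Threading.Tasks;

public static class SyncFastPathSketch
{
    public static Task<int> ReadPreferSyncAsync(NetworkStream stream, byte[] buffer)
    {
        if (stream.DataAvailable)
        {
            // Data is already buffered by the socket, so Read returns
            // without blocking and no async state machine is needed.
            return Task.FromResult(stream.Read(buffer, 0, buffer.Length));
        }

        // Nothing to read yet: wait asynchronously instead of blocking a thread.
        return stream.ReadAsync(buffer, 0, buffer.Length);
    }
}
```

Whether this pays off depends on how often data is already available; measuring both paths under load is the only way to know.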

Supporting both sync and async is tedious

You have to implement the methods twice; there is no bulletproof way of reusing async code from sync code.
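
A small illustration of the duplication, using a hypothetical "read exactly N bytes" helper: the two methods have the same logic and differ only in how they read.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class ReadExactSketch
{
    // Synchronous version: blocks until count bytes are read or the stream ends.
    // Returns the number of bytes actually read.
    public static int ReadExact(Stream stream, byte[] buffer, int count)
    {
        int total = 0;
        while (total < count)
        {
            int read = stream.Read(buffer, total, count - total);
            if (read == 0) break; // end of stream
            total += read;
        }
        return total;
    }

    // Asynchronous version: the same loop again, with awaits.
    public static async Task<int> ReadExactAsync(Stream stream, byte[] buffer, int count)
    {
        int total = 0;
        while (total < count)
        {
            int read = await stream.ReadAsync(buffer, total, count - total).ConfigureAwait(false);
            if (read == 0) break; // end of stream
            total += read;
        }
        return total;
    }
}
```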

vtortola answered Oct 18 '22 02:10
