
How do I write a scalable socket server using C# 4.0?

I want to write a simple socket server, however I'd like it to be vertically scalable, for example, not creating a thread per connection or very long running tasks, which may consume all threads.

The server receives a request containing a query and streams an arbitrarily large result.

I would like the idiomatic way to do this using techniques and libraries that are available in C# 4, with an emphasis on simple code, rather than raw performance.

On reopening: a socket server is a useful part of a scalable system. If you want to scale horizontally, there are different techniques. You probably won't be able to answer this question if you have never created a socket server.

Dave Hillier asked Oct 04 '11 22:10


1 Answer

I've been working on something similar for a week or two now so hopefully I'll be able to help you out a bit.

If your focus is on simple code, I'd recommend using the TcpClient and TcpListener classes. They both make sockets much easier to work with. Although they have existed since .NET Framework 1.1, they have been updated and are still your best bet.

In terms of how to use .NET Framework 4.0 to write simple code, Tasks are the first thing that comes to mind. They make writing asynchronous code much less painful, and they will make it much easier to migrate your code once C# 5 comes out (with the new async and await keywords). Here is an example of how Tasks can simplify your code:

Instead of using tcpListener.BeginAcceptTcpClient(AsyncCallback callback, object state); and providing a callback method which would call EndAcceptTcpClient(); and optionally cast your state object, C# 4 allows you to utilize closures, lambdas, and Tasks to make this process much more readable and scalable. Here is an example:

private void AcceptClient(TcpListener tcpListener)
{
    Task<TcpClient> acceptTcpClientTask = Task.Factory.FromAsync<TcpClient>(
        tcpListener.BeginAcceptTcpClient, tcpListener.EndAcceptTcpClient, tcpListener);

    // This allows us to accept another connection without a loop.
    // Because the continuation runs on the ThreadPool rather than on the caller's stack,
    // this does not cause a stack overflow.
    // Note: with OnlyOnRanToCompletion, a failed accept ends the chain and nothing further is accepted.
    acceptTcpClientTask.ContinueWith(task =>
    {
        OnAcceptConnection(task.Result);
        AcceptClient(tcpListener);
    }, TaskContinuationOptions.OnlyOnRanToCompletion);
}

private void OnAcceptConnection(TcpClient tcpClient)
{
    string authority = tcpClient.Client.RemoteEndPoint.ToString(); // Format is: IP:PORT

    // Start a new Task to handle client-server communication
}
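
For comparison, the callback-based pattern described earlier (BeginAcceptTcpClient with an AsyncCallback and an untyped state object) might look roughly like the sketch below. The names AcceptState, CallbackAcceptSketch, StartAccepting and OnAccepted are made up for illustration and are not part of the answer's code; the exception handling reflects an assumption about what a stopped listener throws.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical holder: the callback only receives one untyped state object,
// so everything it needs has to be bundled and cast back.
public sealed class AcceptState
{
    public TcpListener Listener;
    public Action<TcpClient> OnClient;
}

public static class CallbackAcceptSketch
{
    // Starts the first asynchronous accept on a listener that is already started.
    public static void StartAccepting(TcpListener listener, Action<TcpClient> onClient)
    {
        var state = new AcceptState { Listener = listener, OnClient = onClient };
        listener.BeginAcceptTcpClient(OnAccepted, state);
    }

    private static void OnAccepted(IAsyncResult asyncResult)
    {
        var state = (AcceptState)asyncResult.AsyncState;

        TcpClient client;
        try
        {
            client = state.Listener.EndAcceptTcpClient(asyncResult);
        }
        catch (InvalidOperationException) { return; } // listener stopped (covers ObjectDisposedException)
        catch (SocketException) { return; }           // accept aborted or failed

        try
        {
            // Re-arm the accept before handling this client.
            state.Listener.BeginAcceptTcpClient(OnAccepted, state);
        }
        catch (InvalidOperationException)
        {
            // Listener was stopped in the meantime; still hand over the accepted client.
        }

        state.OnClient(client);
    }
}
```

The Task version above needs neither the state class nor the cast, because the lambda simply closes over tcpListener.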

FromAsync is very useful as Microsoft has provided many overloads that can simplify common asynchronous operations. Here's another example:

private void Read(State state)
{
    // The int result is the number of bytes read, accessible through the Task's Result property.
    Task<int> readTask = Task<int>.Factory.FromAsync(
        state.NetworkStream.BeginRead, state.NetworkStream.EndRead,
        state.Data, state.BytesRead, state.Data.Length - state.BytesRead,
        state, TaskCreationOptions.AttachedToParent);

    readTask.ContinueWith(ReadPacket, TaskContinuationOptions.OnlyOnRanToCompletion);
    readTask.ContinueWith(ReadPacketError, TaskContinuationOptions.OnlyOnFaulted);
}

State is a user-defined class that usually contains the TcpClient instance, the data (a byte array), and perhaps the number of bytes read so far.
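
One possible shape for such a State class, together with a read loop that keeps issuing reads until the buffer is full or the stream ends, is sketched below. This is an illustration under assumptions rather than the answer's actual code: the field is typed as Stream instead of NetworkStream so the sketch can be exercised with a MemoryStream, the TcpClient field is left out, and ReadSketch, FillBuffer and ReadNext are hypothetical names.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical State: field names follow the Read example (NetworkStream, Data, BytesRead).
public sealed class State
{
    public Stream NetworkStream; // assumption: Stream, so any stream can stand in for the socket
    public byte[] Data;          // receive buffer
    public int BytesRead;        // how much of Data is filled so far
}

public static class ReadSketch
{
    // Completes with the total number of bytes read once Data is full or the stream ends.
    public static Task<int> FillBuffer(State state)
    {
        var completion = new TaskCompletionSource<int>();
        ReadNext(state, completion);
        return completion.Task;
    }

    private static void ReadNext(State state, TaskCompletionSource<int> completion)
    {
        int remaining = state.Data.Length - state.BytesRead;
        if (remaining == 0)
        {
            completion.SetResult(state.BytesRead);
            return;
        }

        Task<int>.Factory.FromAsync(
                state.NetworkStream.BeginRead, state.NetworkStream.EndRead,
                state.Data, state.BytesRead, remaining, null)
            .ContinueWith(t =>
            {
                if (t.IsFaulted)
                {
                    completion.SetException(t.Exception.InnerExceptions);
                }
                else if (t.Result == 0)
                {
                    // A read of zero bytes means the other side closed the stream.
                    completion.SetResult(state.BytesRead);
                }
                else
                {
                    state.BytesRead += t.Result;
                    ReadNext(state, completion); // issue the next read without blocking a thread
                }
            });
    }
}
```

A caller would attach its own continuation to the returned task, in the same way ReadPacket and ReadPacketError are attached above.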

As you can see, ContinueWith can be used to replace a lot of cumbersome try-catches that until now were a necessary evil.
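
To make the success and failure paths concrete, here is a minimal sketch that routes a task's outcome through two continuations. ContinuationSketch and Describe are hypothetical names, and Task.Factory.StartNew stands in for a socket operation so that the example is self-contained.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinuationSketch
{
    // Runs work on the ThreadPool and reports either its result or its error message.
    public static Task<string> Describe(Func<int> work)
    {
        Task<int> task = Task.Factory.StartNew(work);
        var outcome = new TaskCompletionSource<string>();

        // Runs only if the task finished without throwing.
        task.ContinueWith(
            t => outcome.SetResult("result: " + t.Result),
            TaskContinuationOptions.OnlyOnRanToCompletion);

        // Runs only if the task threw. Reading t.Exception (an AggregateException)
        // also marks the exception as observed.
        task.ContinueWith(
            t => outcome.SetResult("error: " + t.Exception.InnerException.Message),
            TaskContinuationOptions.OnlyOnFaulted);

        return outcome.Task;
    }
}
```

Only one of the two continuations runs for a given task; the other is canceled. Observing the exception in the fault path matters on .NET 4.0, where an unobserved task exception can bring the process down when the task is finalized.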

At the beginning of your post you mentioned not wanting to create a thread per connection or create very long running tasks and I thought I would address that at this point. Personally, I don't see the problem with creating a thread for each connection.

What you must be careful with, however, is using Tasks (an abstraction over the ThreadPool) for long-running operations. The ThreadPool is useful because the overhead of creating a new Thread is not negligible, so for short pieces of work such as reading or writing data and handling a client connection, Tasks are preferred.

You must remember that the ThreadPool is a shared resource with a specialized function (avoiding the situation where more time is spent creating a thread than actually using it). Because it is shared, a thread that you are holding is unavailable to everything else, and this can quickly lead to thread-pool starvation and deadlock scenarios.
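
If some work really does have to run for a long time, .NET 4 offers TaskCreationOptions.LongRunning, which hints to the scheduler that the task should not occupy a pool thread. The sketch below probes where a task runs; LongRunningSketch and RunsOnPoolThread are hypothetical names, and the use of a dedicated thread is the default scheduler's current behaviour as far as I know, not something the option guarantees.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningSketch
{
    // Starts a task with the given options and reports whether it ran on a ThreadPool thread.
    public static Task<bool> RunsOnPoolThread(TaskCreationOptions options)
    {
        return Task.Factory.StartNew(
            () => Thread.CurrentThread.IsThreadPoolThread,
            options);
    }
}
```

Calling RunsOnPoolThread with TaskCreationOptions.None and then with TaskCreationOptions.LongRunning shows whether the hint moved the work off the pool on your runtime.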

Ryan Peschel answered Oct 22 '22 17:10