High performance asynchronous awaiting sockets

I am writing an app that needs to make hundreds of socket connections over TCP to read and write data.

I have come across this code snippet here and I'm wondering how I can make this more robust.

This is currently how I am calling the code:

foreach (var ip in listofIps)
{
   IPEndPoint remoteEP = new IPEndPoint(IPAddress.Parse(ip), 4001);
   Socket client = new Socket(AddressFamily.InterNetwork,
                           SocketType.Stream, ProtocolType.Tcp);
   client.Connect(remoteEP);
   await ReadAsync(client);
}
  1. Is there anything wrong with the above, and how can it be optimized such that it runs concurrently?

    In the code snippet, the buffer size is set to 1000. Just as a simple illustration, if I were to attempt to print out only the bytes received, and not the remaining 0x00s, I have to do something like this:

    while (true)
    {
        await s.ReceiveAsync(awaitable);
        int bytesRead = args.BytesTransferred;
        if (bytesRead <= 0) break;
        var hex = new StringBuilder(bytesRead * 2);
        var msg = new byte[bytesRead];
    
        for (int i = 0; i < bytesRead; i++)                
            msg[i] = args.Buffer[i];                
    
        foreach (byte b in msg)                
            hex.AppendFormat("{0:x2} ", b);
    
        AppendLog(string.Format("RX: {0}", hex));
    }
    
  2. Is there a more efficient way of doing this? Previously, I would iterate over the whole buffer and print out the data, but that gives me a whole bunch of trailing 0x00s, as my protocol messages are between 60 and 70 bytes long.

asked Jun 13 '13 17:06 by Null Reference


1 Answer

I am writing an app that needs to make hundreds of socket connections over TCP to read and write data.

You don't need "high-performance sockets" for that. The code is far simpler with regular-performance sockets.

For starters, don't use the custom awaitables from the link you posted. They are perfectly fine for some people (and completely "robust"), but you don't need them and your code will be simpler without them.

  1. Is there anything wrong with the above, and can it be further optimized?

Yes. You shouldn't mix blocking (Connect) and asynchronous (ReadAsync) code. I would recommend something like this:

foreach (var ip in listofIps)
{
  IPEndPoint remoteEP = new IPEndPoint(IPAddress.Parse(ip), 4001);
  Socket client = new Socket(AddressFamily.InterNetwork,
                             SocketType.Stream, ProtocolType.Tcp);
  await client.ConnectTaskAsync(remoteEP);
  ...
}

Where ConnectTaskAsync is a standard TAP-over-APM wrapper:

public static Task ConnectTaskAsync(this Socket socket, EndPoint endpoint)
{
  // FromAsync is an instance method, so call it on the Task.Factory instance.
  return Task.Factory.FromAsync(socket.BeginConnect, socket.EndConnect, endpoint, null);
}

As Marc Gravell pointed out, this code (and your original code) is connecting the sockets one at a time. You could use Task.WhenAll to connect them all simultaneously.
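
As a rough sketch of the Task.WhenAll approach, assuming each connection can be started independently: ConnectOneAsync and ConnectAllAsync are hypothetical helper names used only for illustration, and ConnectTaskAsync is the wrapper shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

public static class ConcurrentConnectSketch
{
    // The TAP-over-APM wrapper from the answer.
    public static Task ConnectTaskAsync(this Socket socket, EndPoint endpoint)
    {
        return Task.Factory.FromAsync(socket.BeginConnect, socket.EndConnect, endpoint, null);
    }

    // Hypothetical helper: creates one socket, connects it, and returns it.
    public static async Task<Socket> ConnectOneAsync(string ip, int port)
    {
        var client = new Socket(AddressFamily.InterNetwork,
                                SocketType.Stream, ProtocolType.Tcp);
        try
        {
            await client.ConnectTaskAsync(new IPEndPoint(IPAddress.Parse(ip), port));
            return client;
        }
        catch
        {
            // Do not leak the socket if the connection attempt fails.
            client.Dispose();
            throw;
        }
    }

    // Hypothetical helper: starts every connection attempt first, then waits for all of them.
    public static Task<Socket[]> ConnectAllAsync(IEnumerable<string> ips, int port)
    {
        return Task.WhenAll(ips.Select(ip => ConnectOneAsync(ip, port)));
    }
}
```

The call site would then be something like `Socket[] clients = await ConcurrentConnectSketch.ConnectAllAsync(listofIps, 4001);`. Note that Task.WhenAll faults if any single connection fails, so a real application would probably want per-connection error handling rather than this all-or-nothing form.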

2) Is there a more efficient way of doing this?

First, you should define a TAP-over-APM ReceiveTaskAsync wrapper similar to the above. When dealing with binary data, I also like to have an extension method on byte array segments for dumping:

public static string DumpHex(this ArraySegment<byte> data)
{
  return string.Join(" ", data.Select(b => b.ToString("X2")));
}

Then you can have code like this:

while (true)
{
  int bytesRead = await socket.ReceiveTaskAsync(buffer);
  if (bytesRead == 0) break;
  var data = new ArraySegment<byte>(buffer, 0, bytesRead);
  AppendLog("RX: " + data.DumpHex());
  ...
}
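
The ReceiveTaskAsync wrapper used in the loop above is not shown in the answer. One possible sketch follows, assuming the single-buffer signature the loop implies. Socket.BeginReceive takes four arguments before the callback, and the method-group overloads of FromAsync accept at most three, so this version starts the operation itself and passes the resulting IAsyncResult to FromAsync.

```csharp
using System.Net.Sockets;
using System.Threading.Tasks;

public static class SocketReceiveSketch
{
    // Assumed shape of the wrapper: reads into the whole buffer and
    // completes with the number of bytes received (0 when the peer closes).
    public static Task<int> ReceiveTaskAsync(this Socket socket, byte[] buffer)
    {
        return Task<int>.Factory.FromAsync(
            socket.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, null, null),
            socket.EndReceive);
    }
}
```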

If you do a lot of binary manipulation, you may find my ArraySegments library helpful.

3) Where and how should I include the logic to check if my whole data has arrived within a single read?

Oh, it's more complex than that. :) Sockets are a stream abstraction, not a message abstraction. So if you want to define "messages" in your protocol, you need to include a length prefix or delimiter byte so you can detect the message boundaries. Then you need to write code that will parse out your messages, keeping in mind that blocks of data read from the socket may contain only a partial message (so you have to buffer it), a complete message, multiple complete messages, and may also end with a partial message (again, buffering). And you have to also consider your existing buffer when receiving the new block.

I have a TCP/IP .NET Sockets FAQ on my blog that addresses this specifically and has some example code using my personal default preference for message framing (4-byte little-endian length prefixing).
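
To make the buffering described above concrete, here is a minimal sketch of a parser for 4-byte little-endian length-prefixed messages. LengthPrefixedFramer and its Append method are hypothetical names invented for this illustration; this is not the code from the FAQ, and it leaves out a maximum message size check that a real protocol would likely need.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: accumulates raw blocks and yields complete messages.
public sealed class LengthPrefixedFramer
{
    // Bytes received so far that do not yet form a complete message.
    private readonly List<byte> _pending = new List<byte>();

    // Feed one block exactly as read from the socket (first 'count' bytes of 'block').
    // Returns every message completed by this block, possibly none.
    public List<byte[]> Append(byte[] block, int count)
    {
        for (int i = 0; i < count; i++)
            _pending.Add(block[i]);

        var messages = new List<byte[]>();
        while (_pending.Count >= 4)
        {
            // Decode the little-endian length prefix explicitly,
            // so the result does not depend on host endianness.
            int length = _pending[0] | (_pending[1] << 8) |
                         (_pending[2] << 16) | (_pending[3] << 24);
            if (length < 0)
                throw new InvalidOperationException("Invalid length prefix.");

            // Only part of the message has arrived; keep it buffered.
            if (_pending.Count - 4 < length)
                break;

            messages.Add(_pending.GetRange(4, length).ToArray());
            _pending.RemoveRange(0, 4 + length);
        }
        return messages;
    }
}
```

Inside the receive loop, each block would be passed through the framer, for example `foreach (var message in framer.Append(buffer, bytesRead)) { ... }`, so that partial messages stay buffered until the rest arrives and several messages in one block are all delivered.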

4) How should I include a WriteAsync method such that I can send data through the socket in the middle of reads?

That one's surprisingly tricky:

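public static Task<int> SendTaskAsync(this Socket socket, byte[] buffer, int offset, int size, SocketFlags flags)
{
  // BeginSend takes four arguments, but FromAsync's method-group overloads accept at most three,
  // so start the operation directly and wrap the resulting IAsyncResult.
  return Task<int>.Factory.FromAsync(socket.BeginSend(buffer, offset, size, flags, null, null), socket.EndSend);
}
public static async Task WriteAsync(this Socket socket, byte[] buffer)
{
  int bytesSent = 0;
  // A single send may transmit only part of the buffer, so loop until everything is sent.
  while (bytesSent != buffer.Length)
  {
    bytesSent += await socket.SendTaskAsync(buffer, bytesSent, buffer.Length - bytesSent, SocketFlags.None);
  }
}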
answered Oct 01 '22 04:10 by Stephen Cleary