Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does NetworkStream Read like this?

I have an application that sends messages that are newline terminated over a TCP socket using TCPClient and it's underlying NetworkStream.

The data is streaming in at roughly 28k every 100ms from a realtime data stream for monitoring.

I've stripped out the irrelevant code, this is basically how we read the data:

TcpClient socket; // initialized elsewhere
byte[] bigbuffer = new byte[0x1000000];
socket.ReceiveBufferSize = 0x1000000;
NetworkStream ns = socket.GetStream();
int end = 0;
int sizeToRead = 0x1000000;
while (true)
{
  bytesRead = ns.Read(bigbuffer, end, sizeToRead);
  sizeToRead -= bytesRead;
  end += bytesRead;

  // check for newline in read buffer, and if found, slice it up, and return
  // data for deserialization in another thread

  // circular buffer
  if (sizeToRead == 0)
  {
    sizeToRead = 0x1000000;
    end = 0;
  }
}

The symptom we were seeing, somewhat intermittently based on the amount of data we were sending back, is that there would be a 'lag' of records, where the data we're reading from the stream get progressively older and older versus what we are delivering (after a few minutes of streaming, the lag is in order of 10s of seconds), until eventually it all catches up in one big shot, and the cycle repeats.

We fixed it by maxing out sizeToRead, and (whether or not this is required, I'm not sure, but we did it anyway), removed the ReceiveBufferSize set on TcpClient and kept it at the default 8192 (changing just ReceiveBufferSize did not correct it).

int sizeForThisRead = sizeToRead > 8192 ? 8192 : sizeToRead;
bytesRead = ns.Read(bigBuffer, end, sizeForThisRead);

I thought maybe it was the interaction with nagle and delayed ack, but wireshark showed that data was coming in just fine based on the timestamps and looking into the data (which is timestamped, and the server and clients clocks are synchronized within a second).

We output logs after the ns.Read, and for sure the issue is at the Read call and not the deserialization code.

So what this leads me to believe is that if you set the TcpClient's ReceiveBufferSize really big, and in your Read call on it's underlying NetworkStream pass bytesToRead to be many more bytes than expected to arrive, there's a timeout going on at the Read call waiting for those bytes to arrive, but it still doesn't return everything in the stream? Each successive call in this loop is timing out, until the 1 meg buffer is full, after which when 'end' gets reset back to 0, it inhales everything that's left in the stream causing it all to catch up - but it shouldn't do that because the logic, to me, looks like it should empty the stream completely on the next iteration (because the next sizeToRead will still be > data available in the buffer).

Or maybe it's something I'm not thinking about that I can't synthesize - but perhaps those clever souls here may think of something.

Or maybe this is expected behaviour - if so, why?

like image 832
paquetp Avatar asked Apr 29 '14 13:04

paquetp


1 Answers

This behavior was so interesting that I just had to see it for myself, and... I couldn't.

This anti-answer presents an alternative theory that may explain the lag described in the question. I had to infer some details from the question and comments.

The target application is an interactive UI application with three threads of operation:

  1. A TcpClient network data consumer.
  2. A data queue consumer thread that delivers results to the UI.
  3. The UI thread.

For the purposes of this discussion, assume that TheDataQueue is a BlockingCollection<string> instance (any thread-safe queue would do):

BlockingCollection<string> TheDataQueue = new BlockingCollection<string>(1000);

The application has two synchronous operations that block while waiting for data. The first is the NetworkStream.Read call that is the primary subject of the question:

bytesRead = ns.Read(bigbuffer, end, sizeToRead);

The second blocking operation occurs when data in the work queue is marshalled to the UI for display. Let's suppose the code looks like this:

// A member method on the derived class of `System.Windows.Forms.Form` for the UI.
public void MarshallDataToUI()
{
    // Current thread: data queue consumer thread.
    // This call blocks if the data queue is empty.
    string text = TheDataQueue.Take();

    // Marshall the text to the UI thread.
    Invoke(new Action<string>(ReceiveText), text);
}

private void ReceiveText(string text)
{
    // Display the text.
    textBoxDataFeed.Text = text;

    // Explicitly process all Windows messages currently in the message queue to force
    // immediate UI refresh.  We want the UI to display the very latest data, right?
    // Note that this can be relatively slow...
    Application.DoEvents();
}

In this application design, the observed lag occurs when the network delivers data to TheWorkQueue faster than the UI can show it.

Why might @paquetp's logs show a problem with NetworkStream.Read?

NetworkStream.Read blocks until data is available. If the logs report the elapsed time while waiting for more data, then there will be an apparent delay. But the TcpClient network buffer is actually empty because the application has already read and queued the data. If the realtime data stream is bursty, then this will happen often.

How do you explain that eventually it all catches up in one big shot?

This is a natural consequence of the data queue consumer thread working through the backlog in TheDataQueue.

But what about the packet capture and data timestamps?

If an item is backlogged in TheDataQueue, the data timestamps are correct. But you can't see them yet in the UI. The packet capture timestamps are timely because the network data was received and queued quickly by the network thread.

Isn't this all just guesswork?

Nope. There is a pair of custom apps (producer and consumer) that demonstrate this behavior.

Network Consumer App Screenshot

The screenshot shows that the data queue is backlogged by 383 items. The data timestamp lags the current timestamp by about 41 seconds. I paused the producer a few times to simulate bursty network data.

However, I was never able to make NetworkStream.Read behave as the question supposed.

like image 174
Joel Allison Avatar answered Sep 28 '22 07:09

Joel Allison