Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C# NetworkStream.Read oddity

Can anyone point out the flaw in this code? I'm retrieving some HTML with TcpClient. NetworkStream.Read() never seems to finish when talking to an IIS server. If I go use the Fiddler proxy instead, it works fine, but when talking directly to the target server the .read() loop won't exit until the connection exceptions out with an error like "the remote server has closed the connection".

internal TcpClient Client { get; set; }

/// bunch of other code here...

try
{

NetworkStream ns = Client.GetStream();
StreamWriter sw = new StreamWriter(ns);

sw.Write(request);
sw.Flush();

byte[] buffer = new byte[1024];

int read=0;

try
{
    while ((read = ns.Read(buffer, 0, buffer.Length)) > 0)
    {
        response.AppendFormat("{0}", Encoding.ASCII.GetString(buffer, 0, read));
    }
}
catch //(SocketException se)
{

}
finally
{
    Close();
}

Update

In the debugger, I can see the entire response coming through immediately and being appended to my StringBuilder (response). It just appears that the connection isn't being closed when the server is done sending the response, or my code isn't detecting it.

Conclusion As has been said here, it's best to take advantage of the offerings of the protocol (in the case of HTTP, the Content-Length header) to determine when a transaction is complete. However, I've found that not all pages have content-length set. So, I'm now using a hybrid solution:

  1. For ALL transactions, set the request's Connection header to "close", so that the server is discouraged from keeping the socket open. This improves the chances that the server will close the connection when it is through responding to your request.

  2. If Content-Length is set, use it to determine when a request is complete.

  3. Else, set the NetworkStream's RequestTimeout property to a large, but reasonable, value like 1 second. Then, loop on NetworkStream.Read() until either a) the timeout occurs, or b) you read fewer bytes than you asked for.

Thanks to everyone for their excellent and detailed responses.

like image 835
3Dave Avatar asked Feb 03 '10 17:02

3Dave


People also ask

What C is used for?

C programming language is a machine-independent programming language that is mainly used to create many types of applications and operating systems such as Windows, and other complicated programs such as the Oracle database, Git, Python interpreter, and games and is considered a programming foundation in the process of ...

What is C in C language?

What is C? C is a general-purpose programming language created by Dennis Ritchie at the Bell Laboratories in 1972. It is a very popular language, despite being old. C is strongly associated with UNIX, as it was developed to write the UNIX operating system.

What is the full name of C?

In the real sense it has no meaning or full form. It was developed by Dennis Ritchie and Ken Thompson at AT&T bell Lab. First, they used to call it as B language then later they made some improvement into it and renamed it as C and its superscript as C++ which was invented by Dr.

Is C language easy?

Compared to other languages—like Java, PHP, or C#—C is a relatively simple language to learn for anyone just starting to learn computer programming because of its limited number of keywords.


3 Answers

Contrary to what the documentation for NetworkStream.Read implies, the stream obtained from a TcpClient does not simply return 0 for the number of bytes read when there is no data available - it blocks.

If you look at the documentation for TcpClient, you will see this line:

The TcpClient class provides simple methods for connecting, sending, and receiving stream data over a network in synchronous blocking mode.

Now my guess is that if your Read call is blocking, it's because the server has decided not to send any data back. This is probably because the initial request is not getting through properly.

My first suggestion would be to eliminate the StreamWriter as a possible cause (i.e. buffering/encoding nuances), and write directly to the stream using the NetworkStream.Write method. If that works, make sure that you're using the correct parameters for the StreamWriter.

My second suggestion would be not to depend on the result of a Read call to break the loop. The NetworkStream class has a DataAvailable property that is designed for this. The correct way to write a receive loop is:

NetworkStream netStream = client.GetStream();
int read = 0;
byte[] buffer = new byte[1024];
StringBuilder response = new StringBuilder();
do
{
    read = netStream.Read(buffer, 0, buffer.Length);
    response.Append(Encoding.ASCII.GetString(buffer, 0, read));
}
while (netStream.DataAvailable);
like image 165
Aaronaught Avatar answered Nov 15 '22 15:11

Aaronaught


Read the response until you reach a double CRLF. What you now have is the Response headers. Parse the headers to read the Content-Length header which will be the count of bytes left in the response.

Here is a regular expression that can catch the Content-Length header.

David's Updated Regex

Content-Length: (?<1>\d+)\r\n

Content-Length

Note

If the server does not properly set this header I would not use it.

like image 20
ChaosPandion Avatar answered Nov 15 '22 15:11

ChaosPandion


Not sure if this is helpful or not but with HTTP 1.1 the underlying connection to the server might not be closed so maybe the stream doesn't get closed either? The idea being that you can reuse the connection to send a new request. I think you have to use the content-length. Alternatively use the WebClient or WebRequest classes instead.

like image 31
Timbo Avatar answered Nov 15 '22 14:11

Timbo