Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Speeding up the rate that IIS/.NET/LINQ retrieves data from the Network Buffers

When doing a TCP analysis of the traffic between my web servers and database servers I see the network buffers (TCP Window) filling up frequently. The web servers then send TCP messages to the database server telling it that its buffers are full an not to send more data until given an update.

For example, this is the size of the network buffer in bytes for one of the more long lived connections to the database server over time:

Network Buffer Graph

The web servers are running a .NET 4.0 application running in IIS integrated mode on Windows 2008 R2 web servers. The SQL server is a 2008 R2 server.

My interpretation of this is that the SQL server is returning data to the web servers faster then the application on the web server can collect the data from the buffers. I have tried tuning just about everything I can in the network drivers to work around this issue. In particular increasing the RSS queues, disabling interrupt moderation, and setting Windows 2008 R2 server to increase the buffer size more aggressively.

So if my interpretation is correct that leaves me wondering about two possibilities:

  1. Is there any way in .NET to tell it to increase the size of the network buffers? The "enhanced 2008 R2 TCP stack" is rarely deciding to enable window scaling (Making the buffer bigger than 65 kBytes) for this connection (probably due to the low latency). It looks like the ability to manually set this system wide is gone in Windows server 2008 r2 (There used to be registry entries that are now ignored). So is there a way I can force this in the code?
  2. Is there anything that can be tuned that would speed up the rate at which the application reads information for the network buffers, in particular for the SQL connections?

Edit:
Requested DMV Query cutting off at ASYNC_NETWORK_IO:

SELECT * FROM sys.dm_os_wait_stats ORDER BY waiting_tasks_count desc;
wait_type  waiting_tasks_count  wait_time_ms  max_wait_time_ms  signal_wait_time_ms
CXPACKET            1436226309    2772827343             39259           354295135
SLEEP_TASK           231661274     337253925             10808            71665032
LATCH_EX             214958564     894509148             11855            84816450
SOS_SCHEDULER_YIELD  176997645     227440530              2997           227332659
ASYNC_NETWORK_IO     112914243      84132232             16707            16250951
like image 888
Kyle Brandt Avatar asked Mar 03 '11 00:03

Kyle Brandt


1 Answers

1) What makes you think that this is TCP flow control, as opposed to SQL Server not producing data in the intervals where there is no traffic? Check if sys.dm_exec_requests look at wait_type. The wait types are described in Waits and Queues. If is indeed the client applying TCP flow control, then you'll see the wait type ASYNC_NETWORK_IO.

2) If the issue is indeed network wait type, then the solution is not to increase the bandwidth, but obviously to reduce the traffic. The client has no business requesting so much data from the server as to cause TCP flow control. This would be caused by doing horribly wrong things in the client, like counting rows or client-side paging. Move the processing on the server, and just get small results sets with the data you need.

Edit

Consuming the DB call result set ultimately boils down to one form or another of this:

FetchNextRow
while (not EnfOfResults)
{
  ProcessRow;
  FetchNextRow;
}

What this could mean, in real terms, it could be foreach row in IQueryable or SqlDataReader.Read(). But the basic idea is the same, that the client fetches rows from the result, process them, then gets some more rows. If the client code does anything in that ProcessRow that blocks, then the client code will not reach the point where it fetches the next row again, and thus will eventually trigger TCP flow control which in turn will cause SQL Server to suspend the query (since it has no place to write the results into). There is nothing you can do in terms of TCP to make this better. Increasing the window size can actually make maters worse, as now all those results that were previously suppressed at source (DB) are going to be created and have to be stored somewhere, which will ultimately mean live memory allocated to storage and may make things far worse than they are now.

If I would be in your shoes right now I'd focus on identifying where does that ProcessRow blocking occur. An hypothesis I put forward was that that processing would be an MVC View writing into the response buffer and being blocked in turn by TCP flow control resulting from the user agent not consuming the HTTP response (eg. Ajax call completed but the browser is not running the completion code to consume the response because the main thread is looping busy on something else). As always, the best approach is to methodically measure. Some possible tools:

  • Look at ASP.Net performance counters.
  • Instrument your code with your own performance counters (is much easier than most developer think, see Using XSLT to generate Performance Counters code)
  • Non-invasive sampled performance traces, see Rapid Web Site Profiling with VSPerfASPNETCmd
like image 105
Remus Rusanu Avatar answered Oct 29 '22 02:10

Remus Rusanu