
HttpWebRequest and I/O completion ports

I'm working on an application in which one type of message has to hit a database and the other type has to hit an external XML API.

I have to process A LOT of messages, and one of the big challenges is getting the HttpWebRequest class to perform well. I initially started with just using the standard synchronous methods and running the whole thing on the thread pool. This was not good.

So after a bit of reading I saw that the recommended way to do this was to use the Begin/End methods to delegate the work to I/O completion ports, thus freeing up the thread pool and yielding better performance. This doesn't seem to be the case: the performance is marginally better, but I certainly can't see the I/O completion ports being used that much compared to the thread pool's worker threads.

I have a thread that spins round and reports the available worker threads and completion port threads in the thread pool. The completion port count is always very low (the max I've seen is 9 in use) and I'm always using about 120 worker threads (sometimes more). I use the Begin/End pattern for all methods in HttpWebRequest:

Begin/EndGetRequestStream
Begin/EndWrite (Stream)
Begin/EndGetResponse
Begin/EndRead (Stream)

Am I doing it right? Am I missing something? I can sometimes have up to 2048 HTTP connections open simultaneously (from netstat output), so why would the completion port numbers be so low?

If anyone could give some serious advice about how to manage worker threads, completion ports and HttpWebRequest well, it would be hugely appreciated!

EDIT: Is .NET a reasonable tool for this? Can I get a high volume of HTTP connections working with .NET and the System.Net stack? It's been suggested that I use something like WinHttp (or some other C++ library) and P/Invoke it from .NET, but this isn't something I especially want to do!

peteisace asked Jan 24 '11 07:01



1 Answer

The way I understand it, you don't tie up an I/O completion port all the time that an asynchronous request is outstanding - it's only "busy" when data has been returned and is being processed on the corresponding thread. Hopefully you don't have very much work to do in the callback, which is why you don't have many in-use ports at any one time.
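To make that distinction concrete, here is a minimal sketch (not taken from the question) of how "in use" figures like the ones described are typically derived. The class and method names are hypothetical, and it assumes the numbers come from `ThreadPool.GetMaxThreads` and `ThreadPool.GetAvailableThreads`. Note that the second figure counts thread pool threads currently running I/O callbacks, not the number of asynchronous requests outstanding.

```csharp
using System;
using System.Threading;

// Hypothetical helper: names are illustrative, not from the original post.
static class PoolStats
{
    // Computes how many pool threads are busy right now.
    // "io" is the number of completion port threads currently executing
    // callbacks; pending requests that have not completed yet are not counted.
    public static void GetInUse(out int workers, out int io)
    {
        int maxWorkers, maxIo, freeWorkers, freeIo;
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorkers, out freeIo);
        workers = maxWorkers - freeWorkers;
        io = maxIo - freeIo;
    }

    static void Main()
    {
        int workers, io;
        GetInUse(out workers, out io);
        Console.WriteLine("busy worker threads: " + workers);
        Console.WriteLine("busy completion port threads: " + io);
    }
}
```

If the callbacks are short, a snapshot like this will usually catch only a handful of busy completion port threads even when thousands of requests are in flight.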

Are you actually getting poor performance though? Is your cause for concern merely the low numbers? Are you getting the throughput you'd expect?

One problem you may have is that the HTTP connection pool for any one host is relatively small. If you have hundreds of requests to the same machine, then by default only 2 requests will actually be made at a time, to avoid DoS-attacking the host in question (and to get the benefits of keep-alive). You can increase this programmatically or using app.config. Of course, this may not be an issue in your case, either because you've already fixed the problem or because all your requests are to different hosts. (If netstat is showing 2048 connections then that doesn't sound bad.)
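As a rough illustration of the programmatic route, the sketch below raises the per-host limit through `ServicePointManager`. The value 100 and the URL are placeholders chosen for the example, not recommendations, and the code assumes the .NET Framework `System.Net` stack that `HttpWebRequest` uses.

```csharp
using System;
using System.Net;

// Hypothetical example class: the limit and URL below are placeholders.
static class ConnectionLimitSketch
{
    static void Main()
    {
        // Default limit applied to ServicePoint objects created after this line.
        // Set it before issuing the first request to a host.
        ServicePointManager.DefaultConnectionLimit = 100;

        // Alternatively, adjust the limit for one specific host.
        ServicePoint servicePoint =
            ServicePointManager.FindServicePoint(new Uri("http://api.example.com/"));
        servicePoint.ConnectionLimit = 100;

        Console.WriteLine("connection limit for host: " + servicePoint.ConnectionLimit);
    }
}
```

The app.config route sets the same limit declaratively, through an `<add address="*" maxconnection="100" />` entry under `<system.net>` / `<connectionManagement>`.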

Jon Skeet answered Oct 10 '22 03:10
