Background/Intro:
Very strange issue, where upon a specific exception being thrown, it appears all other threads cease to execute until the exception is handled. The application is a proxy server, piping multiple requests (many hundreds in any one time frame) via a pool of tcp sockets, which connect to N other proxy clients over a socks connection. This has also been tried by passing a delegate, as opposed to using an exception, with the same poor performance results.
A log snippet under normal operation:
14:40:17.700 [PrxSvc:9058] --> [200] 1217ms http://redact.example.com
14:40:17.700 [PrxSvc:9058] C-DEBUG:C
14:40:17.716 [PrxSvc:9058] --> [200] 1098ms http://redact.example.com
14:40:17.716 [PrxSvc:9058] C-DEBUG:C
14:40:17.727 [PrxSvc:9054] --> [200] 905ms http://redact.example.com
14:40:17.727 [PrxSvc:9054] C-DEBUG:C
14:40:17.778 [PrxSvc:9050] --> [200] 453ms http://redact.example.com
14:40:17.778 [PrxSvc:9050] C-DEBUG:C
14:40:17.781 [Unnamed Thread] C-DEBUG:A
14:40:17.781 [Unnamed Thread] C-DEBUG:B
14:40:17.796 [PrxSvc:9058] --> [200] 652ms http://redact.example.com
14:40:17.796 [PrxSvc:9058] C-DEBUG:C
14:40:17.807 [PrxSvc:9056] --> [200] 1555ms http://redact.example.com
14:40:17.807 [PrxSvc:9056] C-DEBUG:C
14:40:17.816 [PrxSvc:9064] --> [200] 396ms http://redact.example.com
The socket pool resuses connections to domains, however when a connection is closed by an external server, we obviously receive no notification of this. When we try to reuse this connection via our TcpSocksHandler.TaskHandler method:
socks.Send(buffer, 0, rcv, SocketFlags.None);
socks.Receive(new byte[1], 0, 1, SocketFlags.Peek);
This exception is caught with the following:
catch
{
//The socket is bad, we should get a new socket.
Log("This socket has expired! - Server has closed the connection.");
Log(String.Format("This socket was {0} seconds old: {1}", seconds, socks.Guid));
socks.Dispose();
Log("C-DEBUG:1");
throw new SocksSocketFailureException(); //throw exception. to bubble back up.
}
This is then caught by it's calling code, multiple times up the stack as follows:
DoHandleRequest{...
try
{
_actualEndPoint = TcpSocksHandler.TaskHandler(socket, context.SocksSocket, 30000000, method);
}
catch (SocksSocketFailureException ssfe)
{
context.SocksSocket = null;
Logger.Log("C-DEBUG:2");
throw;
}
}
ProxyBase.HandleRequest{...
try
{
...
success = DoHandleRequest(context, out bytes);
}
catch (SocksSocketFailureException ssfex)
{
Logger.Log("C-DEBUG:3");
throw;
}
}
ProxyManager.HandleRequest{
while (true)
{
// Pick the best available proxy to handle the request
Logger.Log("C-DEBUG:A");
IProxy proxy = GetNextProxy(context) ?? NullProxy.Instance;
Logger.Log("C-DEBUG:B");
try
{
// Process the request
proxy.HandleRequest(context);
Logger.Log("C-DEBUG:C");
break;
}
catch(SocksSocketFailureException ssfex)
{
Logger.Log("C-DEBUG:4");
}
catch (Exception)
{
break;
}
}
}
From the log snippet below you can see the performance of this (or lack thereof)
14:40:34.090 [PrxSvc:9068] C-DEBUG:A
14:40:34.090 [PrxSvc:9068] C-DEBUG:B
14:40:34.231 [PrxSvc:9068] This socket has expired! - Server has closed the connection.
14:40:34.231 [PrxSvc:9068] This socket was 6.281049 seconds old: 61cc51b9-f121-4529-9649-7defcc1f5586
14:40:34.231 [PrxSvc:9068] C-DEBUG:1
14:40:34.528 [PrxSvc:9068] C-DEBUG:2
14:40:34.715 [PrxSvc:9068] C-DEBUG:3
14:40:34.918 [PrxSvc:9068] C-DEBUG:4
14:40:34.918 [PrxSvc:9068] C-DEBUG:A
No log lines have been edited from the above - Nothing else is processed for pretty much the entire second! (where we could usually handle maybe a hundred requests). In addition, the mere act of bubbling an exception up the stack seems to take a second (sometimes more!). - Please take note of the timings of the above log lines. e.g. .2 seconds gap between DEBUG:3 and DEBUG:4 !!
I have NO idea what could be causing this. ANY advice/ideas would be much appreciated!
UPDATE:
Based an Eamon's question, I pushed the same x64 build to a non-production local machine, running Windows8 64bit. Installed as a service, release build, as the previous example is. The only other difference is it's now targeting 4 proxy nodes (PrxSvc 9050, 9052, 9054, 9056), as opposed to the previous 80. I can no longer say if those exception steps are blocking thread execution now, because they are executed in a timely fashion:
16:53:59.787 [PrxSvc:9056] This socket has expired! - Server has closed the connection.
16:53:59.787 [PrxSvc:9056] This socket was 0.1280009 seconds old: 69d12cc9-9456-47db-86b2-a2ebf87b41bf
16:53:59.787 [PrxSvc:9056] C-DEBUG:1
16:53:59.787 [PrxSvc:9056] C-DEBUG:2
16:53:59.787 [PrxSvc:9056] C-DEBUG:3
16:53:59.787 [PrxSvc:9056] C-DEBUG:4
16:53:59.787 [PrxSvc:9056] C-DEBUG:A
On this machine, the code is currently happily processing 800 concurrent requests at a rate of around 80 requests per second, and could easily process more... What on earth could be causing this difference?!
For completeness I reran the first test (on the win2008 server) with 4 nodes, instead of 80, and got the same rubbish result:
17:22:44.891 [PrxSvc:9054] C-DEBUG:B
17:22:45.063 [PrxSvc:9054] This socket has expired! - Server has closed the connection.
17:22:45.063 [PrxSvc:9054] This socket was 25.84375 seconds old: cfdee74d-9941-4c8c-80cd-f32aa14b7877
17:22:45.063 [PrxSvc:9054] C-DEBUG:1
17:22:45.485 [PrxSvc:9054] C-DEBUG:2
17:22:45.751 [PrxSvc:9054] C-DEBUG:3
17:22:46.016 [PrxSvc:9054] C-DEBUG:4
This is not an answer for your question, what the error in your program is, but that I can hardly tell without seeing the big picture or test your code myself. Normally I would write a comment for this, but my text is too long.
Your first example log ("normal operation") is a bit confusing for me. There are just one C-DEBUG:A
and C-DEBUG:B
while at least for every C-DEBUG:C
there has to be one of each, hasn't it?
The other examples are looking good, everything occurs as I would expect (A => B => exception => 1 => 2 => 3 => 4 => A). Well, it seems there is just one thread, but the examples give no clue that this is wrong; i see no second C-DEBUG:A
from another thread. What do you expect instead?
Regarding to your update: Here I'm wondering more about the performance of your test machine, because every throw of an exception needs some execution time. This is the reason why throwing exceptions in a loop is a big performance hit. The times of your server seems a bit slow, but ok from my point of view.
While beeing not able to give you a hint for your problem, at least I can give you the statement that the exception throwing itself is not the blocking factor for other threads. To prove that I wrote a little program (see source code below if you want to know how I've done that). The output of that program was
19:31:09.2788 [Thread-0] 0
19:31:09.2788 [Thread-1] 1
19:31:09.3908 [Thread-0] 0
19:31:09.3908 [Thread-1] 1
19:31:09.4908 [Thread-1] 1
19:31:09.4908 [Thread-0] 0
19:31:09.5908 [Thread-0] 0
19:31:09.5998 [Thread-1] Caught exception callstack frame 29
19:31:09.6078 [Thread-1] Caught exception callstack frame 28
19:31:09.6148 [Thread-1] Caught exception callstack frame 27
19:31:09.6218 [Thread-1] Caught exception callstack frame 26
19:31:09.6288 [Thread-1] Caught exception callstack frame 25
19:31:09.6358 [Thread-1] Caught exception callstack frame 24
19:31:09.6418 [Thread-1] Caught exception callstack frame 23
19:31:09.6488 [Thread-1] Caught exception callstack frame 22
19:31:09.6548 [Thread-1] Caught exception callstack frame 21
19:31:09.6608 [Thread-1] Caught exception callstack frame 20
19:31:09.6668 [Thread-1] Caught exception callstack frame 19
19:31:09.6728 [Thread-1] Caught exception callstack frame 18
19:31:09.6778 [Thread-1] Caught exception callstack frame 17
19:31:09.6828 [Thread-1] Caught exception callstack frame 16
19:31:09.6888 [Thread-1] Caught exception callstack frame 15
19:31:09.6908 [Thread-0] 0
19:31:09.6938 [Thread-1] Caught exception callstack frame 14
19:31:09.6978 [Thread-1] Caught exception callstack frame 13
19:31:09.7028 [Thread-1] Caught exception callstack frame 12
19:31:09.7078 [Thread-1] Caught exception callstack frame 11
19:31:09.7128 [Thread-1] Caught exception callstack frame 10
19:31:09.7168 [Thread-1] Caught exception callstack frame 9
19:31:09.7218 [Thread-1] Caught exception callstack frame 8
19:31:09.7258 [Thread-1] Caught exception callstack frame 7
19:31:09.7299 [Thread-1] Caught exception callstack frame 6
19:31:09.7339 [Thread-1] Caught exception callstack frame 5
19:31:09.7369 [Thread-1] Caught exception callstack frame 4
19:31:09.7409 [Thread-1] Caught exception callstack frame 3
19:31:09.7439 [Thread-1] Caught exception callstack frame 2
19:31:09.7469 [Thread-1] Caught exception callstack frame 1
19:31:09.7499 [Thread-1] Caught exception callstack frame 0
19:31:09.7509 [Thread-1] 1
19:31:09.7919 [Thread-0] 0
19:31:09.8509 [Thread-1] 1
19:31:09.8919 [Thread-0] 0
19:31:09.9509 [Thread-1] 1
19:31:10.0509 [Thread-1] 1
19:31:10.1509 [Thread-1] 1
19:31:10.2509 [Thread-1] 1
19:31:10.3509 [Thread-1] 1
As you can see, thread Thread-0 prints a 0
while the exception of Thread-1 works its way up the callstack. No blocking here!
For reference here my program:
class Program {
class MyException : Exception {}
// A class for give the starting thread operation some parameters
class ThreadStartParameter {
// For identifying each thread
public int Id { get; set; }
// For building up a deeper callstack frame
public int CallStackDepth { get; set; }
// Indicates that this thread should throw an exception
public bool ThrowException { get; set; }
}
static void Main(string[] args) {
// Create two threads and let them run concurrently
Thread t0 = new Thread(BuildUpCallStack) { Name = "Thread-0" };
Thread t1 = new Thread(BuildUpCallStack) { Name = "Thread-1" };
t0.Start(new ThreadStartParameter {
Id = 0,
CallStackDepth = 0,
ThrowException = false
});
t1.Start(new ThreadStartParameter {
Id = 1,
CallStackDepth = 0,
ThrowException = true
});
Console.Read();
}
// Recursive helper method to build a callstack of 30 frames, which
// is used to rethrow an exception at each level
static void BuildUpCallStack(object data) {
ThreadStartParameter parameter = (ThreadStartParameter)data;
if (parameter.CallStackDepth < 30) {
try {
BuildUpCallStack(new ThreadStartParameter {
Id = parameter.Id,
CallStackDepth = parameter.CallStackDepth + 1,
ThrowException = parameter.ThrowException
});
} catch (MyException e) {
Log(string.Format("Caught exception callstack frame {0}",
parameter.CallStackDepth));
// If an exception occured, rethrow it unless this
// callstack frame was the first
if (parameter.CallStackDepth != 0) throw;
// If this frame was the first in callstack, restart the
// thread but this time without throwing an exception (for
// demonstrate such a restart character like your Proxies do)
BuildUpCallStack(new ThreadStartParameter {
Id = parameter.Id,
CallStackDepth = 0,
ThrowException = false
});
}
return;
}
DoSomething(parameter);
}
static void DoSomething(object data) {
ThreadStartParameter parameter = (ThreadStartParameter)data;
for (int counter = 0; counter < 7; counter++) {
if (counter == 3 && parameter.ThrowException)
throw new MyException();
Log(parameter.Id);
Thread.Sleep(100);
}
}
static void Log(object message) {
Console.WriteLine(
" {0:HH:mm:ss.ffff} [{1}] {2}",
DateTime.Now,
Thread.CurrentThread.Name,
message.ToString());
}
}
If you try this program on your test machine, I guess you should start more "good" threads (which don't throw an exception) and/or increase the needed callstack depth for seeing the effect like I did.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With