
Is there a way to force the application to run as single threaded?

We are supporting an old project that has an issue most likely caused by multi-threading. The original implementer 'fixed' it by calling Thread.Sleep before executing the problematic section. The workaround works, but because the section is inside a loop, the Thread.Sleep adds several minutes to the time the section takes to finish.

Over the last month we have been experimenting with lower values for the sleep, but we want to find the root cause. During our investigation we added locks on private objects wherever we thought it might help. We looked for anything that might be spawning additional threads and found none: no Thread.Start and no ThreadPool usage. What confuses us is that during debugging we find our main thread among about 8 other threads, and we don't know what spawned them. These are background threads, so my first thought was the thread pool, but as I mentioned, the code never references it.

It is .NET 2.0, so no async. This is just one part of a bigger application: it is a Windows service, but we run it from CMD so we can debug it easily. The main application itself is a Windows Forms desktop app. It also uses COM+ components, if that is any help.

I've tried [STA] instead of [MTA], locking as mentioned above, and memory barriers as well.

We still get the issue.

The issue is basically corrupted DataSets and nulls in objects where there shouldn't be any. It happens about once every 25-100 iterations, so reproduction is not straightforward, but we have devised a test specifically to try to reproduce it.

All of that points us in the direction of threading issues.

Back to the original question: what could possibly be spawning those additional threads, and how do we prevent them from being created?

[Screenshot: thread list from the debugger]

Please note the threads marked in red: those are background threads, and as far as we can see the code never mentions them.

The suspected thread in the screenshot is actively modifying the columns in the DataSet. The problem is that the methods calling the SetColValueOnRow function that the thread is executing are ordinary methods that don't use any kind of threading.

The CPU affinity for this application is set to 1 core (part of the original workaround).

Thanks

Edit: The database is Oracle 12c, but the issues we face happen before writing to the database. They usually happen in DataSets, where a whole record or a few of its columns can be wiped once every few test iterations.

AngelicCore asked Apr 06 '17 23:04


1 Answer

I think you need to investigate why Thread.Sleep works. It does not sound like the code itself is spawning additional threads, but you would have to go through the entire code base to be sure, including the COM+ components.

So the first thing I would do is start the program under the debugger by pressing F10, which breaks on the first statement. Then open the Threads debug window and check whether you see about the same number of threads as in your question. If you do, those are simply threads from the thread pool, and your issue is probably unrelated to the multiple threads.
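The same comparison can be made in code rather than by eye. The sketch below is only an illustration: `ThreadCensus` is a made-up name, and it counts operating-system threads in the process, which is not exactly the list of managed threads shown in the Threads window. Logging the count once at startup and again just before the problematic section should show whether the extra threads were there from the beginning.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of the original project.
public static class ThreadCensus
{
    // Returns the number of OS threads currently in this process.
    // Note: this includes native threads (runtime, COM), not only managed ones.
    public static int CountOsThreads()
    {
        using (Process current = Process.GetCurrentProcess())
        {
            return current.Threads.Count;
        }
    }

    // Writes a labelled count so two points in the program can be compared.
    public static void Log(string label)
    {
        Console.Error.WriteLine(string.Format("[{0}] OS threads: {1}", label, CountOsThreads()));
    }
}
```

For example, call `ThreadCensus.Log("startup")` as the first statement of `Main` and `ThreadCensus.Log("before loop")` just before the section that currently sleeps.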

If you don't see the same number of threads, then set breakpoints at various stages of the program and see if you can find where those threads are created. Once you find where they are created, you can try adding some locking at that point. Even so, your issue might not be caused by multiple threads corrupting memory. You should keep investigating until you are convinced whether the cause is multiple threads or something else.
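A cheap way to settle whether more than one thread really touches the data is to record the owning thread and report any access from a different one. This is a minimal sketch under assumptions: `ThreadAffinityProbe` is a hypothetical helper, and the idea of calling `Check` at the top of SetColValueOnRow is a suggestion, not something taken from the original code. It only uses APIs available in .NET 2.0.

```csharp
using System;
using System.Threading;

// Hypothetical diagnostic helper: remembers the thread that created it and
// reports every call to Check that arrives on a different thread.
public sealed class ThreadAffinityProbe
{
    private readonly int ownerThreadId = Thread.CurrentThread.ManagedThreadId;
    private int foreignAccessCount;

    // Number of Check calls seen so far from threads other than the owner.
    public int ForeignAccessCount
    {
        get { return Thread.VolatileRead(ref foreignAccessCount); }
    }

    // Call at the top of the suspect method. Returns true when the caller
    // is the owning thread, false (after logging) when it is not.
    public bool Check(string site)
    {
        Thread caller = Thread.CurrentThread;
        if (caller.ManagedThreadId == ownerThreadId)
        {
            return true;
        }

        Interlocked.Increment(ref foreignAccessCount);
        Console.Error.WriteLine(string.Format(
            "{0}: called on thread {1} (name '{2}', background={3}, pool={4}), owner is {5}",
            site, caller.ManagedThreadId, caller.Name,
            caller.IsBackground, caller.IsThreadPoolThread, ownerThreadId));
        // The stack trace shows who is calling in from the other thread.
        Console.Error.WriteLine(Environment.StackTrace);
        return false;
    }
}
```

Create one probe on the main thread next to the DataSet and call `probe.Check("SetColValueOnRow")` inside the method. If `ForeignAccessCount` stays at zero across a failing run, the corruption is unlikely to come from concurrent managed access to that method.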

I suspect that the issue might be related to one or more of the COM+ components, or maybe the code is calling some long-running database stored procedure. In any case, I suspect the reason Thread.Sleep works is that it gives the suspect component enough time to complete its operation before the next operation starts.

If this theory is true, it suggests that consecutive operations interfere with each other, and that a sufficiently large Thread.Sleep value lets one operation complete before the next begins, so no interference occurs. It also suggests that one of the COM+ components may be doing some things asynchronously. The solution might be to use locks or critical sections inside the COM+ components' code. Another idea is to redesign the section of code that is causing the problem so that it tolerates multiple simultaneous operations.
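If a component does turn out to finish its work asynchronously, a fixed sleep can be replaced by an explicit wait for completion. The sketch below assumes the component offers some way to signal that it is done; the `CompletedHandler` and `StartOperation` delegates and the `CompletionGate` class are hypothetical stand-ins, since the question does not show the component's interface.

```csharp
using System;
using System.Threading;

// Hypothetical shapes: the real component's notification mechanism is unknown.
public delegate void CompletedHandler();
public delegate void StartOperation(CompletedHandler onCompleted);

public static class CompletionGate
{
    // Starts the operation, then blocks until it signals completion or the
    // timeout expires. Returns true if completion was signalled in time.
    public static bool RunAndWait(StartOperation start, int timeoutMs)
    {
        // Deliberately not disposed: a completion that arrives after a
        // timeout must not hit a disposed handle. The finalizer cleans it up.
        ManualResetEvent done = new ManualResetEvent(false);
        start(delegate() { done.Set(); });
        return done.WaitOne(timeoutMs, false);
    }
}
```

Unlike Thread.Sleep, this waits only as long as the operation actually takes, and a `false` return makes a stalled operation visible instead of silently continuing.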

So the problem you are experiencing may not be due to multiple threads in the C# code you are looking at. It might instead be due to a long-running operation that sometimes fails if it is not given enough time to complete before the next operation starts.

Bob Bryan answered Sep 28 '22 09:09