I am testing out the idea of threads, but only in a few key spots right now. Threads add a pretty fascinating level of complexity to just about anything, and .NET offers many choices for threading within System.Threading. I'd like to know which is best for handling string operations.
Consider a complex string being fed to a custom object. That object currently splits the string at some point, and feeds part one to a function, then when that function completes, feeds the other half of the string to a second function. The two functions have no dependencies on each other, so should be good candidates for threading so that both functions can work concurrently on each piece of the string.
Example before threading:
Public Sub ParseString(ByVal SomeStr As String)
    If String.IsNullOrWhiteSpace(SomeStr) Then
        Throw New ArgumentNullException("SomeStr")
    End If
    ' Assume that ParsedFirstString is a boolean that is set to
    ' True if the call to ParseFirstString completes successfully.
    ' Ditto for ParsedSecondString.
    Dim MyDelimiter As Char = "|"c
    Dim SomeStrArr As String() = SomeStr.Split({MyDelimiter}, 2)
    Call Me.ParseFirstString(SomeStrArr(0))
    If Me.ParsedFirstString = False Then
        Throw New ArgumentException("Failed to parse the first part of the string.")
    End If
    Call Me.ParseSecondString(SomeStrArr(1))
    If Me.ParsedSecondString = False Then
        Throw New ArgumentException("Failed to parse the second part of the string.")
    End If
End Sub
This works fine, and testing inside a timing loop on my multicore system, I can execute it 1,000 times in ~140ms-170ms (avg ~1,200ms+ if 10,000 times). This is an acceptable speed and if I can't get threading to play nice, then I'll move on. But I tried one threading approach after looking at one threading example and an SO question on invoking a thread with parameters and wound up with code similar to the following:
Public Sub ParseString(ByVal SomeStr As String)
    If String.IsNullOrWhiteSpace(SomeStr) Then
        Throw New ArgumentNullException("SomeStr")
    End If
    Dim MyDelimiter As Char = "|"c
    Dim SomeStrArr As String() = SomeStr.Split({MyDelimiter}, 2)
    Dim FirstThread As New Thread(Sub() Me.ParseFirstString(SomeStrArr(0)))
    Dim SecondThread As New Thread(Sub() Me.ParseSecondString(SomeStrArr(1)))
    FirstThread.Priority = ThreadPriority.Highest
    SecondThread.Priority = ThreadPriority.Highest
    Call FirstThread.Start()
    Call SecondThread.Start()
    ' Nothing waits for the threads here, so these checks can run too early.
    If Me.ParsedFirstString = False Then
        Throw New ArgumentException("Failed to parse the first part of the string.")
    End If
    If Me.ParsedSecondString = False Then
        Throw New ArgumentException("Failed to parse the second part of the string.")
    End If
End Sub
The problem with this is that the success checks can run before either thread has finished parsing, which trips one of the two exceptions. So I looked around further and found that I could use the Join method to wait for both threads to complete. This stops the exceptions from tripping, but it drastically increases the execution time: executing the above function 1,000 times now yields an average runtime of up to ~3,700ms. It almost seems like threading is just not suitable for this kind of task.
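For reference, here is a minimal sketch of the Join variant described above. The StringParser class, its two flag properties, and the trivial parser bodies are hypothetical stand-ins for the real object, which is not shown in the question. Creating two dedicated threads on every call is a plausible reason for the slowdown, since starting a thread can cost more than parsing a short string.

```vbnet
Imports System
Imports System.Threading

Public Class StringParser
    ' Hypothetical stand-ins for the flags and parsers from the question.
    Public Property ParsedFirstString As Boolean
    Public Property ParsedSecondString As Boolean

    Private Sub ParseFirstString(ByVal Part As String)
        Me.ParsedFirstString = Not String.IsNullOrWhiteSpace(Part)
    End Sub

    Private Sub ParseSecondString(ByVal Part As String)
        Me.ParsedSecondString = Not String.IsNullOrWhiteSpace(Part)
    End Sub

    Public Sub ParseString(ByVal SomeStr As String)
        If String.IsNullOrWhiteSpace(SomeStr) Then
            Throw New ArgumentNullException("SomeStr")
        End If

        Dim SomeStrArr As String() = SomeStr.Split({"|"c}, 2)
        If SomeStrArr.Length < 2 Then
            Throw New ArgumentException("Expected a '|' delimiter.", "SomeStr")
        End If

        Dim FirstThread As New Thread(Sub() Me.ParseFirstString(SomeStrArr(0)))
        Dim SecondThread As New Thread(Sub() Me.ParseSecondString(SomeStrArr(1)))
        FirstThread.Start()
        SecondThread.Start()

        ' Block until both workers have finished before reading the flags.
        FirstThread.Join()
        SecondThread.Join()

        If Not Me.ParsedFirstString Then
            Throw New ArgumentException("Failed to parse the first part of the string.")
        End If
        If Not Me.ParsedSecondString Then
            Throw New ArgumentException("Failed to parse the second part of the string.")
        End If
    End Sub
End Class

Module Program
    Sub Main()
        Dim Parser As New StringParser()
        Parser.ParseString("alpha|beta")
        Console.WriteLine(Parser.ParsedFirstString AndAlso Parser.ParsedSecondString)
    End Sub
End Module
```

The thread priorities are left at their defaults here; raising them is not needed for correctness.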
But it appears that there are other mechanisms for threading, including ThreadPools and BackgroundWorkers. Probably others I haven't looked up yet (I just started messing with this a few hours ago).
What is the community's opinion on threading for this kind of task? What is wrong with my first attempt at threading?
FYI, I am not updating any UI components nor writing results out to any kind of storage medium.
Conclusion:
It appears my string parsing functions are a lot better than I thought. I tried both the Parallel class and the Task class. In a 10,000-iteration loop, the single-threaded version takes about ~1,220ms-1,260ms on my test data. Even a basic Parallel.Invoke() that splits the parsing into two parallel operations adds up to ~300ms to that timing loop (likely due to the overhead of the anonymous delegates, and there seems to be no way around this). For comparison, this is on a Core2 Q9550 Yorkfield, not overclocked, 95W processor.
The winning choice is to remain single-threaded for this specific area of code. Thanks to all who participated!
I suggest using TPL classes like Parallel and Task.
Whether your code benefits from parallel execution is something you need to benchmark on the particular machine to find out. This is the best approach to take. The same code can slow down on one machine but speed up considerably on another. It basically depends on the CPU (number of cores, hyper-threading, etc.), the algorithm, and the number of parallel tasks.
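One way to run such a benchmark is with System.Diagnostics.Stopwatch. The sketch below is only an illustration: CountLetters is a hypothetical stand-in for the real parsing functions, TimeIt is a helper invented for this example, and the input string and iteration count are arbitrary. The timings it prints will vary by machine, which is the point of measuring.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Threading.Tasks

Module ParseBenchmark
    ' Hypothetical stand-in for the real parsing work.
    Private Function CountLetters(ByVal Part As String) As Integer
        Dim Count As Integer = 0
        For Each Ch As Char In Part
            If Char.IsLetter(Ch) Then Count += 1
        Next
        Return Count
    End Function

    ' Runs Work the given number of times and returns elapsed milliseconds.
    Private Function TimeIt(ByVal Iterations As Integer, ByVal Work As Action) As Long
        Work() ' Warm-up call so JIT compilation is not part of the measurement.
        Dim Watch As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To Iterations
            Work()
        Next
        Watch.Stop()
        Return Watch.ElapsedMilliseconds
    End Function

    Sub Main()
        Dim Parts As String() = "first half of the input|second half of the input".Split({"|"c}, 2)
        Const Iterations As Integer = 10000

        ' Both halves parsed one after the other on the calling thread.
        Dim SequentialMs As Long = TimeIt(Iterations,
            Sub()
                CountLetters(Parts(0))
                CountLetters(Parts(1))
            End Sub)

        ' Both halves handed to Parallel.Invoke, which returns when both are done.
        Dim ParallelMs As Long = TimeIt(Iterations,
            Sub()
                Parallel.Invoke(
                    Sub() CountLetters(Parts(0)),
                    Sub() CountLetters(Parts(1)))
            End Sub)

        Console.WriteLine("Sequential: {0} ms", SequentialMs)
        Console.WriteLine("Parallel.Invoke: {0} ms", ParallelMs)
    End Sub
End Module
```

When each unit of work is very short, the scheduling overhead of the parallel version can outweigh any gain, so it is worth testing with realistic input sizes.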
If you use TPL your code would look as simple as:
Call Parallel.Invoke(
    Sub()
        Me.ParseFirstString(SomeStrArr(0))
    End Sub,
    Sub()
        Me.ParseSecondString(SomeStrArr(1))
    End Sub)
I'm sorry, I'm not good at VB.NET syntax. There might be a way to make it shorter.
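The Task class is the other TPL option. As a rough sketch, and assuming the parsers can be changed to return their success flag instead of setting a field, each half can run as a Task(Of Boolean). TryParseFirst and TryParseSecond are hypothetical names with placeholder bodies, not part of the original code.

```vbnet
Imports System
Imports System.Threading.Tasks

Module TaskSketch
    ' Hypothetical parsers that report success through their return value.
    Private Function TryParseFirst(ByVal Part As String) As Boolean
        Return Not String.IsNullOrWhiteSpace(Part)
    End Function

    Private Function TryParseSecond(ByVal Part As String) As Boolean
        Return Not String.IsNullOrWhiteSpace(Part)
    End Function

    Sub Main()
        Dim Parts As String() = "alpha|beta".Split({"|"c}, 2)

        ' Queue both halves; the task scheduler decides where they run.
        Dim FirstTask As Task(Of Boolean) =
            Task.Factory.StartNew(Function() TryParseFirst(Parts(0)))
        Dim SecondTask As Task(Of Boolean) =
            Task.Factory.StartNew(Function() TryParseSecond(Parts(1)))

        ' Wait for both tasks before looking at either result.
        Task.WaitAll(FirstTask, SecondTask)

        If Not FirstTask.Result Then
            Throw New ArgumentException("Failed to parse the first part of the string.")
        End If
        If Not SecondTask.Result Then
            Throw New ArgumentException("Failed to parse the second part of the string.")
        End If

        Console.WriteLine("Both parts parsed.")
    End Sub
End Module
```

Returning the result from each task avoids sharing mutable flags between threads, though the per-call overhead still has to be measured against the single-threaded version.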