I have a set of XML files that I want to load into memory so that I can process them.
I am loading the files into a Collection, and it seems to be a lot faster to load them on a single thread than to use the thread pool.
I would have expected it to be the other way around.
Why is using multiple threads to load files into memory significantly slower than iterating through the file list and loading each file one after another on a single thread?
This is with C# on .NET 3.5.
The code:
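ICollection<XmlDocument> xmlFilesToProcess = new Collection<XmlDocument>();
foreach (FileInfo fileInfo in fileList)
{
    // Queue one work item per file; the file path is passed in as the state object.
    ThreadPool.QueueUserWorkItem(
        (o) =>
        {
            XmlDocument doc = new XmlDocument();
            doc.Load((string)o);
            // The collection and the counter are shared, so guard both with a lock.
            lock (xmlFilesToProcess)
            {
                xmlFilesToProcess.Add(doc);
                counter++;
            }
        }, fileInfo.FullName);
}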
Without seeing the code, it's hard to tell. If the XML files are small or few in number and you only have one CPU, it could simply be that context switching between threads takes more time than reading the files does.
EDIT
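Now that I see the code: you are queuing a separate work item for every file, which puts way too many threads to work at once. I suggest using Parallel.For from the Task Parallel Library (TPL) instead. It is available for .NET 3.5 as a separate download (the Parallel Extensions preview) and is built into .NET 4 and later.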
See http://msdn.microsoft.com/en-us/magazine/cc163340.aspx for more info on TPL.
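As a rough illustration of the Parallel.For suggestion, here is a minimal sketch. It assumes .NET 4 or later, where Parallel lives in System.Threading.Tasks (the .NET 3.5 preview build used a different namespace). The class name XmlBatchLoader, the method LoadAll and the maxParallel parameter are made up for this example and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using System.Xml;

// Hypothetical helper, named for illustration only.
public static class XmlBatchLoader
{
    // Loads every file into an XmlDocument using at most maxParallel workers.
    // Each iteration writes to its own array slot, so no lock is needed
    // and the results stay in the same order as the input paths.
    public static XmlDocument[] LoadAll(IList<string> paths, int maxParallel)
    {
        XmlDocument[] docs = new XmlDocument[paths.Count];
        ParallelOptions options = new ParallelOptions
        {
            MaxDegreeOfParallelism = maxParallel
        };

        // Parallel.For returns only after every iteration has finished,
        // so the caller never sees a partially filled array.
        Parallel.For(0, paths.Count, options, i =>
        {
            XmlDocument doc = new XmlDocument();
            doc.Load(paths[i]);
            docs[i] = doc;
        });

        return docs;
    }
}
```

A call such as XmlBatchLoader.LoadAll(paths, Environment.ProcessorCount) caps the worker count at the number of cores. Whether this beats the single-threaded loop still depends on the disk and the file sizes, so it is worth timing both.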