I started experimenting with multithreading for a CPU-intensive batch process I'm running. Essentially, I'm trying to combine multiple single-page TIFFs into single PDF documents. This works fine with a foreach loop or standard iteration, but it can be very slow for several 100-page documents. Based on some examples I found, I tried the following multithreaded version. It gives a significant performance improvement, but it obliterates the page order: instead of 1, 2, 3, 4 the pages come out as 1, 3, 4, 2, 6, 5, depending on which thread completes first.
My question is: how can I use this technique while maintaining the page order, and if I can, will it negate the performance benefit of the multithreading? Thank you in advance.
PdfDocument doc = new PdfDocument();
string mail = textBox1.Text;
string[] split = mail.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);
int counter = split.Count();
// Source must be array or IList.
var source = Enumerable.Range(0, 100000).ToArray();
// Partition the entire source array.
var rangePartitioner = Partitioner.Create(0, counter);
double[] results = new double[counter];
// Loop over the partitions in parallel.
Parallel.ForEach(rangePartitioner, (range, loopState) =>
{
    // Loop over each range element without a delegate invocation.
    for (int i = range.Item1; i < range.Item2; i++)
    {
        f_prime = split[i].Replace(" " , "");
        PdfPage page = doc.AddPage();
        XGraphics gfx = XGraphics.FromPdfPage(page);
        XImage image = XImage.FromFile(f_prime);
        double x = 0;
        gfx.DrawImage(image, x, 0);
    }
});
I would just use the overload of Parallel.ForEach that passes the element index to the loop body:
Parallel.ForEach(rangePartitioner, (range, loopState, elementIndex) =>
Then, in your loop, you can fill an array with the results of your work and go through them in order once all iterations have completed.
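The idea can be sketched without any PDF library. This is only an illustration under assumptions: `IndexedResultsSketch`, `ProcessInParallel`, and `Process` are hypothetical names, and `Process` stands in for whatever expensive per-file step can safely run off the main thread. It iterates over the file names directly rather than over a range partitioner, so the index passed to the body is the element's position in the array.

```csharp
using System;
using System.Threading.Tasks;

public static class IndexedResultsSketch
{
    // Hypothetical stand-in for the CPU-heavy per-file step.
    private static string Process(string path) => path.Replace(" ", "");

    public static string[] ProcessInParallel(string[] paths)
    {
        var results = new string[paths.Length];

        // The three-parameter body receives the element's index (a long).
        // Each iteration writes only to its own slot, so no locking is needed.
        Parallel.ForEach(paths, (path, loopState, index) =>
        {
            results[index] = Process(path);
        });

        // Iterations may finish in any order, but results[i] always
        // corresponds to paths[i].
        return results;
    }
}
```

Once `ProcessInParallel` returns, an ordinary sequential loop over the array can add pages one by one, so the document is assembled in source order while the expensive work still ran in parallel.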
I'm not sure the other solutions will work exactly the way the asker wants. The reason is that PdfPage page = doc.AddPage(); creates a new page and adds it to the document in one step, so the pages will always be out of order: the order is dictated on a first-come, first-served basis through doc.
If AddPage is a fast operation, you can create all 100 pages up front, without any processing, and then go back through and render the TIFF images into the pages.
PdfDocument doc = new PdfDocument();
string mail = textBox1.Text;
string[] split = mail.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);
int counter = split.Length;
// Partition the index range [0, counter).
var rangePartitioner = Partitioner.Create(0, counter);
// Create every page sequentially first so the page order is fixed up front.
PdfPage[] pages = new PdfPage[counter];
for (int i = 0; i < counter; ++i)
{
    pages[i] = doc.AddPage();
}
// Loop over the partitions in parallel.
Parallel.ForEach(rangePartitioner, (range, loopState) =>
{
    // Loop over each range element without a delegate invocation.
    for (int i = range.Item1; i < range.Item2; i++)
    {
        // Local variable: a shared field would be overwritten by other threads.
        string f_prime = split[i].Replace(" ", "");
        PdfPage page = pages[i];
        XGraphics gfx = XGraphics.FromPdfPage(page);
        XImage image = XImage.FromFile(f_prime);
        gfx.DrawImage(image, 0, 0);
    }
});
Edit
I think there is a more elegant solution, but without knowing the properties of PdfPage I didn't want to offer it before. If you can tell which page index a PdfPage belongs to, you can make things very simple, like so:
PdfDocument doc = new PdfDocument();
string mail = textBox1.Text;
string[] split = mail.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);
int counter = split.Length;
// Partition the index range [0, counter).
var rangePartitioner = Partitioner.Create(0, counter);
// Loop over the partitions in parallel.
Parallel.ForEach(rangePartitioner, (range, loopState) =>
{
    // Loop over each range element without a delegate invocation.
    for (int i = range.Item1; i < range.Item2; i++)
    {
        PdfPage page = doc.AddPage();
        // Use i only to count iterations, not as the page index.
        int pageIndex = page.PageIndex; // This is what I don't know
        // Local variable: a shared field would be overwritten by other threads.
        string f_prime = split[pageIndex].Replace(" ", "");
        XGraphics gfx = XGraphics.FromPdfPage(page);
        XImage image = XImage.FromFile(f_prime);
        gfx.DrawImage(image, 0, 0);
    }
});
Use .AsParallel().AsOrdered(), as described in this document: http://msdn.microsoft.com/en-us/library/dd460677.aspx
I think it would look something like this:
rangePartitioner.AsParallel().AsOrdered().ForAll(
    range =>
    {
        // Loop over each range element without a delegate invocation.
        ...
    });
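One caveat worth checking against the linked page: AsOrdered() governs the order in which a query's results come back, whereas ForAll runs its delegate on multiple threads with no ordering guarantee. A minimal sketch of the result-ordering side, using only the standard library, might look like the following. `OrderedQuerySketch`, `LoadAllInOrder`, and `LoadPage` are hypothetical names, and `LoadPage` merely stands in for the expensive per-file work.

```csharp
using System;
using System.Linq;

public static class OrderedQuerySketch
{
    // Hypothetical stand-in for the expensive per-file work.
    private static string LoadPage(string path) => "decoded:" + path.Trim();

    public static string[] LoadAllInOrder(string[] paths)
    {
        return paths
            .AsParallel()      // run the projection on multiple threads
            .AsOrdered()       // but keep results in source order
            .Select(LoadPage)
            .ToArray();        // buffered, ordered results
    }
}
```

The returned array can then be consumed by a sequential loop that appends pages, so the parallel part is limited to work that does not depend on order.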