I'm using the ImageMagick.NET library for C# and I want to get some information from each page in a PDF document. Here is my current code:
var list = new MagickImageCollection();
list.Read(file.FullName);
foreach (var page in list)
{
    if (!backgroundWorker.CancellationPending)
    {
        pageCount.pageColorspace(page);
        isFormat(page.Width, page.Height);
        pageCount.incPdfPages();
    }
}
But in my opinion the performance is really slow. It takes 4 minutes for 10 PDF files with 703 pages. Is there a way to make it faster?
You can improve the performance by reading the file page by page. If you read a whole file at once, all of its pages are held in memory at the same time. Your machine probably cannot allocate that much memory, so ImageMagick falls back to storing the pixels on disk, which reduces performance.
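To get a feel for the numbers, here is a rough back-of-the-envelope sketch. It is not part of Magick.NET: the helper name EstimateBytes, the A4 page size at 72 DPI (about 595 x 842 pixels), the 4 channels and the 2 bytes per channel are all assumptions for illustration, and the real figures depend on the ImageMagick build and the density the PDF is rendered at. The page count of 703 is taken from the question.

```csharp
using System;

public static class PixelMemoryEstimate
{
    // Bytes needed to hold the uncompressed pixels of `pages` equally sized pages.
    // Hypothetical helper for illustration only; not a Magick.NET API.
    public static long EstimateBytes(int widthPx, int heightPx, int channels, int bytesPerChannel, int pages)
    {
        // Widen to long first so the multiplication cannot overflow an int.
        long bytesPerPage = (long)widthPx * heightPx * channels * bytesPerChannel;
        return bytesPerPage * pages;
    }

    public static void Main()
    {
        // Assumed values: A4 at 72 DPI, 4 channels, 2 bytes per channel.
        long onePage = EstimateBytes(595, 842, 4, 2, 1);
        long allPages = EstimateBytes(595, 842, 4, 2, 703);

        Console.WriteLine($"One page:  {onePage / (1024.0 * 1024.0):F1} MiB");
        Console.WriteLine($"703 pages: {allPages / (1024.0 * 1024.0 * 1024.0):F2} GiB");
    }
}
```

Under these assumptions a single page needs only a few megabytes, while 703 pages held together need several gigabytes. Rendering at a higher density makes this worse, since the pixel count grows with the square of the DPI.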
You can specify the page you want to read with the FrameIndex property of the MagickReadSettings class. If you specify a page index that is too high, an exception will be raised (requires Magick.NET 7.0.0.0005 or higher) with a message saying that you are requesting an invalid page. You need to rely on this exception to detect the end of the document because ImageMagick does not know the page count of a PDF file. Below is an example of how you could do this.
int page = 0;
while (true)
{
    // Ask ImageMagick to read only the page at this index.
    MagickReadSettings settings = new MagickReadSettings()
    {
        FrameIndex = page
    };
    try
    {
        using (MagickImage image = new MagickImage(@"C:\YourFile.pdf", settings))
        {
            // Do something with the image....
        }
    }
    catch (MagickException ex)
    {
        // Reading past the last page raises this error; treat it as the end of the document.
        if (ex.Message.Contains("Requested FirstPage is greater"))
            break;
        else
            throw;
    }
    page++;
}