How can I verify that a PDF file is "good"?

Question

I have a process that compresses PDF files that our secretaries create by scanning signed documents at a multi-function printer.

On rare occasions, these files cannot be opened in Acrobat Reader after being compressed. I don't know why this happens, so I'd like to test each PDF after compression and see if it is "good".

I am trying to use iTextSharp 5.1.1 to accomplish this, but it happily loads the broken PDFs without complaint. My best guess is that Acrobat Reader fails when it tries to display the image.

Any ideas on how I can tell if the PDF will render?

ewall · Accepted Answer

In similar situations in the past I have successfully used the PDF Toolkit (a.k.a. pdftk) to repair bad PDFs with a command like this:

pdftk broken.pdf output fixed.pdf

Kevin Buchan · Answer

OK, what I ended up doing was using iTextSharp to loop through all of the stream objects and check their lengths. In my error condition, a stream's length was zero. This test seems quite reliable. It may not work for everyone, but it worked in this particular situation.
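For anyone who wants to try the same idea without iTextSharp, here is a rough sketch in Python using only the standard library. It is not the code I actually ran: it scans the raw bytes for `stream ... endstream` pairs instead of parsing the PDF properly, and the function names (`find_empty_streams`, `has_empty_stream`) are made up for illustration. A binary payload that happens to contain the word "endstream" could fool it, so treat it as a quick heuristic only.

```python
import re
from typing import List

# "stream" followed by an end-of-line, then the raw payload, then "endstream".
# The lookbehind keeps the "stream" inside "endstream" from starting a match.
_STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)


def find_empty_streams(pdf_bytes: bytes) -> List[int]:
    """Return the byte offset of every stream whose payload is zero bytes long."""
    return [
        match.start()
        for match in _STREAM_RE.finditer(pdf_bytes)
        if len(match.group(1)) == 0
    ]


def has_empty_stream(path: str) -> bool:
    """Read a file from disk and report whether any stream in it is empty."""
    with open(path, "rb") as handle:
        return bool(find_empty_streams(handle.read()))
```

Calling `has_empty_stream("compressed.pdf")` (the file name is a placeholder) after the compression step gives a yes/no answer that a batch script can act on.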

Zombo · Answer

pdfcpu works great. Relaxed example:

pdfcpu validate goggles.pdf

Strict example:

pdfcpu validate -m strict goggles.pdf

https://pdfcpu.io/core/validate

Oli · Answer

qpdf will be of great help here. Install it (the command below is for Debian/Ubuntu) and run its check:

apt-get install qpdf

qpdf --check filename.pdf

Example output:

checking filename.pdf
PDF Version: 1.4
File is not encrypted
File is not linearized
WARNING: filename.pdf: file is damaged
WARNING: filename.pdf (object 185 0, file position 1235875): expected n n obj
WARNING: filename.pdf: Attempting to reconstruct cross-reference table
WARNING: filename.pdf: object 185 0 not found in file after regenerating cross reference table
operation for Dictionary object attempted on object of wrong type
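If you want to run this check from a script instead of reading the output by eye, qpdf's exit status is enough: its documentation lists 0 for no problems, 2 for errors, and 3 for warnings only. Below is a minimal Python sketch of that idea. The helper names (`classify_exit_code`, `check_pdf`) and the verdict strings are my own, and it assumes `qpdf` is on the PATH.

```python
import subprocess
from typing import Tuple

# Exit statuses documented by qpdf: 0 = clean, 2 = errors, 3 = warnings only.
_VERDICTS = {0: "ok", 2: "error", 3: "warning"}


def classify_exit_code(code: int) -> str:
    """Map a qpdf exit status to a short verdict; anything else is 'unknown'."""
    return _VERDICTS.get(code, "unknown")


def check_pdf(path: str, qpdf: str = "qpdf") -> Tuple[str, str]:
    """Run `qpdf --check` on one file and return (verdict, combined output)."""
    proc = subprocess.run(
        [qpdf, "--check", path],
        capture_output=True,
        text=True,
    )
    return classify_exit_code(proc.returncode), proc.stdout + proc.stderr
```

Whether "warning" should count as "good" is a judgment call; for files that must open in Acrobat Reader it may be safer to accept only "ok".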

Tags: powershell, itextsharp
