
How to check if a PDF has any kind of digital signature

I need to determine whether a PDF has any kind of digital signature. I have to handle huge PDFs, e.g. 500 MB each, so I just need a way to separate unsigned files from signed ones (so that I can send only the signed PDFs to a method that processes them). Every procedure I have found so far involves attempting to extract the certificate, e.g. with the Bouncy Castle libraries (in my case, for Java): if a certificate is present, the PDF is signed; if it is not present, or an exception is raised, it is not (sic!). But this is obviously time- and memory-consuming, and a wasteful implementation besides.

Is there a quick, language-independent way, e.g. opening the PDF file, reading the first bytes, and finding a marker that says the file is signed? Alternatively, is there a reference manual that describes in detail how a PDF is structured internally?

Thank you in advance

Sampisa asked Dec 23 '16 12:12


2 Answers

From the command line, you can check whether a file has a digital signature with the pdfsig tool from the poppler-utils package (works on Ubuntu 20.04).

pdfsig pdffile.pdf

prints details of each signature in the file, along with its validation data. If you need to scan a directory tree of PDF files and get a list of the signed ones, you can use a bash command like:

find ./path/to/files -iname '*.pdf' \
  -exec bash -c 'if pdfsig "$1"; then echo "$1" >> signed-files.txt; fi' _ {} \;

You will get a list of the signed files in signed-files.txt in the current directory. The command relies on pdfsig returning a non-zero exit status for files without a signature.

I have found this to be much more reliable than trying to grep some text out of a PDF file (for example, the PDFs produced by signing services in Lithuania do not contain the string "SigFlags", which was mentioned in the previous answers).

dgvirtual answered Sep 24 '22 00:09


You are going to want to use a PDF library rather than trying to implement all of this yourself. Otherwise you will get bogged down handling the variations: linearized documents, filters, incremental updates, object streams, cross-reference streams, and more.

As for reference material: from a cursory search, it looks like Adobe no longer makes its copy of the ISO 32000-1:2008 specification freely available, though that specification is mainly a translation of the PDF v1.7 Reference manual into ISO-conforming language.

So assuming the PDF v1.7 Reference, the most relevant sections are going to be 8.7 (Digital Signatures), 3.6.1 (Document Catalog), and 8.6 (Interactive Forms).

The basic process is going to be:

  1. Read the Document Catalog for 'Perms' and 'AcroForm' entries.
  2. Read the 'Perms' dictionary for 'DocMDP', 'UR', or 'UR3' entries. If these entries exist, in all likelihood you have either a certified document or a Reader-enabled document.
  3. Read the 'AcroForm' entry (make sure that you do not have an 'XFA' entry, because in the words of Frazier from Porgy and Bess: Dat's a complication!). You basically want to check first whether there is an (optional) 'SigFlags' entry, in which case a non-zero value would indicate that there is a signature field in the 'Fields' array. Otherwise, you need to walk each entry of the 'Fields' array looking for a field dictionary with an 'FT' (Field Type) entry set to 'Sig' (signature) and a 'V' (Value) entry that is not null.

Using a PDF library that can use the document's cross-reference table to navigate you to the right indirect objects should be faster and less resource-intensive than a brute-force search of the document for a certificate.
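
As a rough illustration of the three steps above, here is a minimal Python sketch. It assumes a PDF library has already parsed the file and resolved indirect references, so that the Document Catalog is available as plain nested dictionaries; that dictionary shape and the function names (perms_entries, has_signed_field, looks_signed) are hypothetical, not any particular library's API. It skips the 'SigFlags' shortcut and always walks 'Fields', since that entry is optional. It also descends into 'Kids', which goes slightly beyond the steps as written, and it does not handle XFA or inherited 'FT' values.

```python
from typing import Any, Iterable, List, Mapping

# Assumption: a PDF library has already turned the catalog into nested
# mappings keyed by PDF names without the leading slash.
PdfDict = Mapping[str, Any]


def perms_entries(catalog: PdfDict) -> List[str]:
    """Steps 1-2: list the DocMDP/UR/UR3 entries found under 'Perms'."""
    perms = catalog.get("Perms") or {}
    return [key for key in ("DocMDP", "UR", "UR3") if key in perms]


def has_signed_field(fields: Iterable[PdfDict]) -> bool:
    """Step 3: look for a field with FT == 'Sig' and a non-null V."""
    for field in fields:
        if field.get("FT") == "Sig" and field.get("V") is not None:
            return True
        # Assumption: child fields, if any, are listed under 'Kids'.
        if has_signed_field(field.get("Kids") or []):
            return True
    return False


def looks_signed(catalog: PdfDict) -> bool:
    """Return True if the catalog suggests a signed or certified document."""
    if perms_entries(catalog):
        return True
    acro_form = catalog.get("AcroForm") or {}
    return has_signed_field(acro_form.get("Fields") or [])


# Hand-written sample catalogs, for illustration only.
signed_catalog = {
    "AcroForm": {"SigFlags": 3, "Fields": [{"FT": "Sig", "V": {"Type": "Sig"}}]}
}
empty_field_catalog = {"AcroForm": {"SigFlags": 1, "Fields": [{"FT": "Sig"}]}}
certified_catalog = {"Perms": {"DocMDP": {"Type": "Sig"}}}
```

With these samples, looks_signed is true for signed_catalog and certified_catalog, and false for empty_field_catalog, whose signature field exists but has no value yet.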

Patrick Gallot answered Sep 26 '22 00:09