
python: pdf - set password protected print, copy, paste options?

I'm looking for a Python library that allows me to set password protected print, copy, paste options on existing PDF files.

What I tried

I looked at the reportlab pdfencrypt module. It has exactly the options I need, but the open source version is heavily restricted (it can't even set a real password), and the commercial license is not an option at over £1000/year: this will be relatively low volume (< 1000 documents processed per year), and the client is a non-profit organisation.

asked Feb 18 '23 16:02 by zack


2 Answers

You (just like me) must have wondered what the "-3904" in Kevin's answer means.

Make yourself comfortable: I have the answer.

I have found it in the PDF 1.6 reference. You can get it here: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdf_reference_archive/PDFReference16.pdf

Section 3.5, page 99:

32-bit integer containing a set of flags specifying which access permissions should be granted when the document is opened with user access. Table 3.20 shows the meanings of these flags. Bit positions within the flag word are numbered from 1 (low-order) to 32 (high-order). A 1 bit in any position enables the corresponding access permission. Which bits are meaningful, and in some cases how they are interpreted, depends on the security handler’s revision number (specified in the encryption dictionary’s R entry).

Note: PDF integer objects are represented internally in signed twos-complement form. Since all the reserved high-order flag bits in the encryption dictionary’s P value are required to be 1, the value must be specified as a negative integer. For example, assuming revision 2 of the security handler, the value -44 permits printing and copying but disallows modifying the contents and annotations.

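So P is the permissions value. Check Table 3.20 in that document for what each bit means. In binary, the low-order byte of -44 is 11010100: bits 3 (print) and 5 (copy) are set, bits 4 (modify) and 6 (annotate) are clear, and the reserved bits 7 and 8 are 1.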
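To double-check that bit pattern, here is a small sketch in plain Python. The helper name `set_bits` is my own, not something from the PDF reference or any library. It lists which bit positions are set, numbering from 1 at the low-order end as the reference does:

```python
def set_bits(p, width=32):
    """Return the 1-based positions of the bits set in p's two's-complement form."""
    unsigned = p & ((1 << width) - 1)  # reinterpret the signed value as unsigned
    return [pos for pos in range(1, width + 1) if (unsigned >> (pos - 1)) & 1]


low_byte = format(-44 & 0xFF, "08b")
print(low_byte)                # 11010100
print(set_bits(-44, width=8))  # [3, 5, 7, 8]
```

Positions 3 and 5 are the print and copy permissions; 7 and 8 are reserved bits that must be 1.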

Here is how I did it (printing and copying are permitted, but modifying the contents and annotations is disallowed):

from hashlib import md5

from PyPDF4 import PdfFileReader, PdfFileWriter
from PyPDF4.generic import NameObject, DictionaryObject, ArrayObject, \
    NumberObject, ByteStringObject
from PyPDF4.pdf import _alg33, _alg34, _alg35
from PyPDF4.utils import b_


def encrypt(writer_obj: PdfFileWriter, user_pwd, owner_pwd=None, use_128bit=True):
    """
    Encrypt this PDF file with the PDF Standard encryption handler.

    :param str user_pwd: The "user password", which allows for opening
        and reading the PDF file with the restrictions provided.
    :param str owner_pwd: The "owner password", which allows for
        opening the PDF files without any restrictions.  By default,
        the owner password is the same as the user password.
    :param bool use_128bit: flag as to whether to use 128bit
        encryption.  When false, 40bit encryption will be used.  By default,
        this flag is on.
    """
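    import random
    import time
    if owner_pwd is None:
        owner_pwd = user_pwd
    if use_128bit:
        V = 2
        rev = 3
        keylen = 128 // 8
    else:
        V = 1
        rev = 2
        keylen = 40 // 8
    # Permit printing (bit 3) and copying (bit 5); modifying (bit 4) and
    # annotating (bit 6) stay disallowed. Note: -44 also has bits 9-12 set,
    # which revision 3 handlers (the 128-bit default here) interpret as
    # further permissions (form filling, accessibility extraction, assembly,
    # high-quality printing).
    P = -44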
    O = ByteStringObject(_alg33(owner_pwd, user_pwd, rev, keylen))
    ID_1 = ByteStringObject(md5(b_(repr(time.time()))).digest())
    ID_2 = ByteStringObject(md5(b_(repr(random.random()))).digest())
    writer_obj._ID = ArrayObject((ID_1, ID_2))
    if rev == 2:
        U, key = _alg34(user_pwd, O, P, ID_1)
    else:
        assert rev == 3
        U, key = _alg35(user_pwd, rev, keylen, O, P, ID_1, False)
    # Build the /Encrypt dictionary (named encrypt_dict so it does not
    # shadow this function's own name).
    encrypt_dict = DictionaryObject()
    encrypt_dict[NameObject("/Filter")] = NameObject("/Standard")
    encrypt_dict[NameObject("/V")] = NumberObject(V)
    if V == 2:
        encrypt_dict[NameObject("/Length")] = NumberObject(keylen * 8)
    encrypt_dict[NameObject("/R")] = NumberObject(rev)
    encrypt_dict[NameObject("/O")] = ByteStringObject(O)
    encrypt_dict[NameObject("/U")] = ByteStringObject(U)
    encrypt_dict[NameObject("/P")] = NumberObject(P)
    writer_obj._encrypt = writer_obj._addObject(encrypt_dict)
    writer_obj._encrypt_key = key


unmeta = PdfFileReader('my_pdf.pdf')

writer = PdfFileWriter()
writer.appendPagesFromReader(unmeta)
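# User password '1' opens the file with the restrictions applied;
# owner password '123' opens it without them.
encrypt(writer, '1', '123')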

with open('my_pdf_encrypted.pdf', 'wb') as fp:
    writer.write(fp)
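
Rather than hard-coding P, the value can be computed from the permissions you want to grant. The following is only a sketch: the names `PERMISSION_BITS`, `RESERVED_ONES` and `permission_flags` are hypothetical (they are not part of PyPDF4), and the bit positions are as I read them from Table 3.20 of the PDF 1.6 reference, so please check them against the table before relying on this.

```python
# 1-based bit positions, as read from Table 3.20 (PDF 1.6 reference).
# Bits 9-12 are only meaningful for revision 3 or greater.
PERMISSION_BITS = {
    "print": 3,
    "modify": 4,
    "copy": 5,
    "annotate": 6,
    "fill_forms": 9,
    "extract_accessibility": 10,
    "assemble": 11,
    "print_high_quality": 12,
}

# Reserved bits 7-8 and 13-32 must be 1; bits 1-2 must be 0.
RESERVED_ONES = 0xFFFFF0C0


def permission_flags(*allowed):
    """Build the signed 32-bit /P value that grants only the named permissions."""
    value = RESERVED_ONES
    for name in allowed:
        value |= 1 << (PERMISSION_BITS[name] - 1)
    return value - (1 << 32)  # convert to a signed (negative) integer


print(permission_flags())                 # -3904
print(permission_flags("print", "copy"))  # -3884
```

With nothing allowed this gives -3904, the value from Kevin's answer. Print and copy alone give -3884; you only arrive at -44 if the four revision 3 bits (9 to 12) are switched on as well.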

Please vote, if you liked my answer ;).

answered Mar 05 '23 22:03 by Tom Jones


Laaate answer, but I wanted to start contributing...

In the pdf.py file, in the encrypt() method, change the flag:

    # permit everything:
    P = -1

to:

    # prevent everything:
    P = -3904

This prevents all features beyond simple viewing. Be sure to pass in an owner password that differs from the user password.
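
To see what a given flag value actually grants, it can be decoded back into permission names. This is a sketch only: `PERMISSION_BITS` and `granted` are hypothetical names of mine, and the bit positions are my reading of Table 3.20 in the PDF 1.6 reference.

```python
# 1-based bit positions, as read from Table 3.20 (PDF 1.6 reference).
PERMISSION_BITS = {
    "print": 3,
    "modify": 4,
    "copy": 5,
    "annotate": 6,
    "fill_forms": 9,
    "extract_accessibility": 10,
    "assemble": 11,
    "print_high_quality": 12,
}


def granted(p):
    """List the permission names whose bits are set in the signed /P value p."""
    unsigned = p & 0xFFFFFFFF  # two's-complement view of the signed value
    return [name for name, bit in PERMISSION_BITS.items()
            if (unsigned >> (bit - 1)) & 1]


print(granted(-3904))  # []
print(granted(-1))     # every name in PERMISSION_BITS
```

-3904 leaves every permission bit clear and keeps only the reserved bits set, which is why nothing beyond viewing is allowed.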

answered Mar 05 '23 22:03 by Kevin