
Open a protected pdf file in python

I wrote a PDF password cracker and found the password of a protected PDF file. I want to write a Python program that can display that PDF on the screen without asking for the password. I use the PyPDF library. I know how to open a file that has no password, but I can't figure out how to open the protected one. Any ideas? Thanks.

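import subprocess
import sys

filePath = input()  # raw_input() on Python 2
password = 'abc'  # found by the cracker, not used yet
if sys.platform.startswith('linux'):
    # opens the file in the default viewer, which still asks for the password
    subprocess.call(["xdg-open", filePath])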
KL84 asked Sep 30 '14 21:09


People also ask

How do I open an encrypted PDF in Python?

Decrypt with qpdf. We can download its Windows installer from SourceForge, or install it on macOS with the brew install qpdf command. The point is that Python runs the qpdf command as an OS command and saves the decrypted PDF as a new file without a password.
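The qpdf approach described above could look roughly like the sketch below. It assumes the qpdf executable is installed and on the PATH. The function names build_qpdf_command and decrypt_with_qpdf are made up for this example; --password and --decrypt are qpdf command-line options.

```python
import subprocess


def build_qpdf_command(input_path, output_path, password):
    """Build the qpdf argument list (hypothetical helper for this sketch)."""
    return ["qpdf", "--password=" + password, "--decrypt", input_path, output_path]


def decrypt_with_qpdf(input_path, output_path, password):
    """Run qpdf as an OS command and write an unprotected copy of the PDF."""
    # check=True raises CalledProcessError on any non-zero exit status.
    # qpdf also exits non-zero when it only reports warnings, so a
    # warning-only run is treated as a failure here.
    subprocess.run(build_qpdf_command(input_path, output_path, password), check=True)


if __name__ == "__main__":
    # example usage, assuming encrypted.pdf exists and 'abc' is its password
    decrypt_with_qpdf("encrypted.pdf", "decrypted.pdf", "abc")
```

Passing the arguments as a list avoids shell quoting problems when the password or the file names contain spaces.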


2 Answers

The approach shown by KL84 basically works, but the code is not correct (it writes the output file once for each page). A cleaned-up version is here:

https://gist.github.com/bzamecnik/1abb64affb21322256f1c4ebbb59a364

# Decrypt password-protected PDF in Python.
# 
# Requirements (PdfFileReader and PdfFileWriter were removed in PyPDF2 3.0):
# pip install "PyPDF2<3"

from PyPDF2 import PdfFileReader, PdfFileWriter

def decrypt_pdf(input_path, output_path, password):
  with open(input_path, 'rb') as input_file, \
    open(output_path, 'wb') as output_file:
    reader = PdfFileReader(input_file)
    reader.decrypt(password)

    writer = PdfFileWriter()

    for i in range(reader.getNumPages()):
      writer.addPage(reader.getPage(i))

    writer.write(output_file)

if __name__ == '__main__':
  # example usage:
  decrypt_pdf('encrypted.pdf', 'decrypted.pdf', 'secret_password')
Bohumir Zamecnik answered Sep 30 '22 18:09


You should use the pikepdf library nowadays instead:

import pikepdf

with pikepdf.open("input.pdf", password="abc") as pdf:
    num_pages = len(pdf.pages)
    print("Total pages:", num_pages)

PyPDF2 doesn't support many encryption algorithms. pikepdf seems to handle them: it supports most password-protection methods, and it is also documented and actively maintained.
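To actually display the file, as the question asks, one option is to save an unprotected temporary copy with pikepdf and hand that copy to the system viewer. The sketch below is only one way to do it. The helper names viewer_command and decrypt_and_view are invented for this example, and the viewer commands for macOS and Windows are assumptions added next to the xdg-open call from the question.

```python
import subprocess
import sys
import tempfile


def viewer_command(path, platform=sys.platform):
    """Return the command that opens `path` in the default viewer (hypothetical helper)."""
    if platform.startswith("linux"):
        return ["xdg-open", path]
    if platform == "darwin":
        return ["open", path]
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    raise OSError("unsupported platform: " + platform)


def decrypt_and_view(input_path, password):
    """Write an unencrypted temporary copy of the PDF and open it in a viewer."""
    # pikepdf is third-party (pip install pikepdf). It is imported here so
    # that viewer_command stays usable without it.
    import pikepdf

    # Reserve a temporary file name; the file is closed before pikepdf writes to it.
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
        output_path = tmp.name
    with pikepdf.open(input_path, password=password) as pdf:
        # save() without an encryption argument writes the copy unencrypted.
        pdf.save(output_path)
    subprocess.call(viewer_command(output_path))
    return output_path


if __name__ == "__main__":
    # example usage, assuming input.pdf exists and 'abc' is its password
    decrypt_and_view("input.pdf", "abc")
```

The decrypted copy stays on disk after the viewer is launched, so delete the returned path when it is no longer needed.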

rockikz answered Sep 30 '22 17:09