PyPDF2 returning blank PDF after copy

Tags:

def EncryptPDFFiles(password, directory):
    pdfFiles = []
    success = 0

    # Get all PDF files from a directory
    for folderName, subFolders, fileNames in os.walk(directory):
        for fileName in fileNames:
            if (fileName.endswith(".pdf")):
                pdfFiles.append(os.path.join(folderName, fileName))
    print("%s PDF documents found." % str(len(pdfFiles)))

    # Create an encrypted version for each document
    for pdf in pdfFiles:
        # Copy old PDF into a new PDF object
        pdfFile = open(pdf,"rb")
        pdfReader = PyPDF2.PdfFileReader(pdfFile)
        pdfWriter = PyPDF2.PdfFileWriter()
        for pageNum in range(pdfReader.numPages):
            pdfWriter.addPage(pdfReader.getPage(pageNum))
        pdfFile.close()

        # Encrypt the new PDF and save it
        saveName = pdf.replace(".pdf",ENCRYPTION_TAG)
        pdfWriter.encrypt(password)
        newFile = open(saveName, "wb")
        pdfWriter.write(newFile)
        newFile.close()
        print("%s saved to: %s" % (pdf, saveName))


        # Verify the the encrypted PDF encrypted properly
        encryptedPdfFile = open(saveName,"rb")
        encryptedPdfReader = PyPDF2.PdfFileReader(encryptedPdfFile)
        canDecrypt = encryptedPdfReader.decrypt(password)
        encryptedPdfFile.close()
        if (canDecrypt):
            print("%s successfully encrypted." % (pdf))
            send2trash.send2trash(pdf)
            success += 1

    print("%s of %s successfully encrypted." % (str(success),str(len(pdfFiles))))

I am following along with Pythons Automate the Boring Stuff section. I've had off and on issues when doing the copy for a PDF document but as of right now everytime I run the program my copied PDF is all blank pages. There are the correct amount of pages of my newly encrypted PDF but they are all blank (no content on the pages). I've had this happen before but was not able to recreate. I've tried throwing in a sleep before closing my files. I'm not sure what the best practice for opening and closing files are in Python. For reference I'm using Python3.

282

asked Jun 05 '17 18:06

stryker14

2 Answers

Try moving the pdfFile.close to the very end of your for loop.

Click to copy

for pdf in pdfFiles:
    #
    # {stuff}
    #
    if (canDecrypt):
        print("%s successfully encrypted." % (pdf))
        send2trash.send2trash(pdf)
        success += 1

    pdfFile.close()

The thought is that the pdfFile needs to be available and open when the pdfWriter finally writes out, otherwise it cannot access the pages to write the new file.

195

answered Oct 09 '22 10:10

James C. Taylor

The issue with getting a blank page even after adding a page to your pdf with writer.addPage(your_page_name) is the context manager. You have to make sure that you're not closing the pdf from which you're reading the page.

For Example:

Click to copy

with open(str(_pdf), "rb") as in_f:
    reader = PdfFileReader(in_f)
    _page = reader.getPage(0)
    writer = PdfFileWriter()
    writer.addPage(_page)

with open(_filename, "wb+") as out_f:
    writer.write(out_f)

This will NOT WORK since the file handle is being closed by the context manager. The file has to be open So we would have to indent it. Like the following:

Click to copy

with open(str(_pdf), "rb") as in_f:
    reader = PdfFileReader(in_f)
    _page = reader.getPage(0)
    writer = PdfFileWriter()
    writer.addPage(_page)

    with open(_filename, "wb+") as out_f:
        writer.write(out_f)

I know it's not a big deal but this literally made me pull out my hair, indentation wasted my 6 hours. That's why I thought I should write an answer for others

answered Oct 09 '22 09:10

Mujeeb Ishaque

Related questions
                            
                                Pandas - merging dataframes conditionally on multiple columns
                            
                                Other option for colored scrollbar in tkinter based program?
                            
                                Using collections.Counter to count emojis with different colors
                            
                                PySpark sampleBy using multiple columns
                            
                                How do I just keep the rows with the maximum value in a column for items of the same type? [duplicate]
                            
                                python Get the unique values from a dictionary
                            
                                Create a dataframe from a dict where values are variable-length lists
                            
                                Are there more than three types of methods in Python?
                            
                                How to define default argument value based on previous arguments?
                            
                                Pandas reading NULL as a NaN float instead of str [duplicate]
                            
                                How can I upload a 'file' to S3 by creating a temp file, using AWS Lambda?
                            
                                How to get all combination from multiple lists?
                            
                                __call__ method of type class
                            
                                Scipy.optimize.curve_fit won't fit cosine power law
                            
                                Parsing a .proto file without creating the descriptor
                            
                                Save panda boxplot as image
                            
                                Pandas: Groupby to create table with count and count values
                            
                                How to create a title that will not appear in the toctree with Sphinx?
                            
                                simpler recursive code runs slower than iterative version of the same thing
                            
                                Population must be a sequence or set. For dicts, use list(d)

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

PyPDF2 returning blank PDF after copy

Tags:

python

python-3.x

pypdf2

stryker14

People also ask

2 Answers

James C. Taylor

Mujeeb Ishaque

Recent Activity

Donate For Us