Merge PDF files

1 Answer

You can use PyPDF2's PdfFileMerger class (named PdfMerger in newer PyPDF2 releases).

File Concatenation

You can simply concatenate files by using the append method.

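from PyPDF2 import PdfFileMerger  # named PdfMerger in newer PyPDF2 releases

pdfs = ['file1.pdf', 'file2.pdf', 'file3.pdf', 'file4.pdf']

merger = PdfFileMerger()

# Append each input file, in order, to the end of the output
for pdf in pdfs:
    merger.append(pdf)

merger.write("result.pdf")
merger.close()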

You can pass file handles instead of file paths if you want.

File Merging

If you want more fine-grained control of merging, PdfFileMerger has a merge method, which allows you to specify an insertion point in the output file, meaning you can insert the pages anywhere in the file. The append method can be thought of as a merge where the insertion point is the end of the file.

e.g.

merger.merge(2, pdf)

Here we insert the whole PDF into the output, but at page 2 rather than at the end.
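One way to picture the insertion point is as a list-slice insertion. The sketch below is only an analogy: it models the output document as a plain Python list of page labels, assumes a zero-based position, and does not call PyPDF2 at all. The helper name insert_pages is hypothetical.

```python
def insert_pages(output, position, new_pages):
    """Return a copy of `output` with `new_pages` inserted at index `position`.

    Hypothetical helper: it models pages as list items to illustrate the
    idea of an insertion point and does not touch any PDF file.
    """
    merged = list(output)
    merged[position:position] = new_pages  # slice assignment inserts the items
    return merged


existing = ["A1", "A2", "A3", "A4"]
incoming = ["B1", "B2"]

# Inserting at position 2 places the new pages after the first two pages.
middle = insert_pages(existing, 2, incoming)

# Inserting at the end of the output behaves like an append.
end = insert_pages(existing, len(existing), incoming)
```

Under this model, append is simply the special case where the position equals the current number of pages in the output.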

Page Ranges

If you wish to control which pages are appended from a particular file, you can use the pages keyword argument of append and merge, passing a tuple in the form (start, stop[, step]) (like the regular range function).

e.g.

merger.append(pdf, pages=(0, 3))     # first 3 pages
merger.append(pdf, pages=(0, 6, 2))  # pages 1, 3, 5

If you specify an invalid range, you will get an IndexError.
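Since the tuple follows the same convention as range, you can preview which pages a given tuple selects before touching any PDF. This is a small sketch using only the built-in range; selected_pages is a hypothetical helper name, and the indices it returns are zero-based.

```python
def selected_pages(pages):
    """Expand a (start, stop[, step]) tuple into zero-based page indices."""
    return list(range(*pages))


first_three = selected_pages((0, 3))     # indices of the first 3 pages
every_other = selected_pages((0, 6, 2))  # every second page among the first 6

# Convert to the 1-based page numbers a reader would see in a PDF viewer.
every_other_numbers = [index + 1 for index in every_other]
```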

Note also that, to avoid files being left open, PdfFileMerger's close method should be called once the merged file has been written. This ensures all files (input and output) are closed in a timely manner. It's a shame that PdfFileMerger isn't implemented as a context manager, which would let us use the with keyword, avoid the explicit close call and get some easy exception safety.
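For any object that exposes a close method, contextlib.closing from the standard library gives similar exception safety. The sketch below is one possible wrapper, not part of PyPDF2: merge_pdfs and its merger_factory parameter are hypothetical names, and the only assumption is that the factory returns an object with append, write and close methods.

```python
from contextlib import closing


def merge_pdfs(paths, output, merger_factory):
    """Append each path to a new merger, write the result, and always close.

    `merger_factory` is a hypothetical parameter: any callable returning an
    object with append(), write() and close() methods, such as PyPDF2's
    merger class.
    """
    # closing() calls merger.close() on exit, even if an exception is raised.
    with closing(merger_factory()) as merger:
        for path in paths:
            merger.append(path)
        merger.write(output)
```

With PyPDF2 installed, a call might look like merge_pdfs(['file1.pdf', 'file2.pdf'], 'result.pdf', PdfFileMerger).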

You might also want to look at the pdfcat script provided as part of PyPDF2. It may let you avoid writing code altogether.

The PyPDF2 GitHub repository also includes some example code demonstrating merging.

PyMuPDF

Another library perhaps worth a look is PyMuPDF, which seems to be actively maintained. Merging is equally simple.

From the command line:

python -m fitz join -o result.pdf file1.pdf file2.pdf file3.pdf

And from code:

import fitz  # PyMuPDF

result = fitz.open()  # new, empty PDF

for pdf in ['file1.pdf', 'file2.pdf', 'file3.pdf']:
    with fitz.open(pdf) as mfile:
        result.insert_pdf(mfile)  # called insertPDF in older PyMuPDF releases

result.save("result.pdf")

There are plenty of options, detailed in the project's wiki.

answered Oct 15 '22 11:10

Paul Rooney

Tags:

python

file-io

pdf

pypdf2

pypdf

Btibert3
