Add text to Existing PDF using Python

Example for [Python 2.7]:

from pyPdf import PdfFileWriter, PdfFileReader
import StringIO
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter

packet = StringIO.StringIO()
can = canvas.Canvas(packet, pagesize=letter)
can.drawString(10, 100, "Hello world")
can.save()

#move to the beginning of the StringIO buffer
packet.seek(0)

# create a new PDF with Reportlab
new_pdf = PdfFileReader(packet)
# read your existing PDF
existing_pdf = PdfFileReader(file("original.pdf", "rb"))
output = PdfFileWriter()
# add the "watermark" (which is the new pdf) on the existing page
page = existing_pdf.getPage(0)
page.mergePage(new_pdf.getPage(0))
output.addPage(page)
# finally, write "output" to a real file
outputStream = file("destination.pdf", "wb")
output.write(outputStream)
outputStream.close()

Example for Python 3.x:

from PyPDF2 import PdfFileWriter, PdfFileReader
import io
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter

packet = io.BytesIO()
can = canvas.Canvas(packet, pagesize=letter)
can.drawString(10, 100, "Hello world")
can.save()

#move to the beginning of the StringIO buffer
packet.seek(0)

# create a new PDF with Reportlab
new_pdf = PdfFileReader(packet)
# read your existing PDF
existing_pdf = PdfFileReader(open("original.pdf", "rb"))
output = PdfFileWriter()
# add the "watermark" (which is the new pdf) on the existing page
page = existing_pdf.getPage(0)
page.mergePage(new_pdf.getPage(0))
output.addPage(page)
# finally, write "output" to a real file
outputStream = open("destination.pdf", "wb")
output.write(outputStream)
outputStream.close()

I know this is an older post, but I spent a long time trying to find a solution. I came across a decent one using only ReportLab and PyPDF so I thought I'd share:

read your PDF using PdfFileReader(), we'll call this input
create a new pdf containing your text to add using ReportLab, save this as a string object
read the string object using PdfFileReader(), we'll call this text
create a new PDF object using PdfFileWriter(), we'll call this output
iterate through input and apply .mergePage(*text*.getPage(0)) for each page you want the text added to, then use output.addPage() to add the modified pages to a new document

This works well for simple text additions. See PyPDF's sample for watermarking a document.

Here is some code to answer the question below:

packet = StringIO.StringIO()
can = canvas.Canvas(packet, pagesize=letter)
<do something with canvas>
can.save()
packet.seek(0)
input = PdfFileReader(packet)

From here you can merge the pages of the input file with another document.

pdfrw will let you read in pages from an existing PDF and draw them to a reportlab canvas (similar to drawing an image). There are examples for this in the pdfrw examples/rl1 subdirectory on github. Disclaimer: I am the pdfrw author.

Leveraging David Dehghan's answer above, the following works in Python 2.7.13:

from PyPDF2 import PdfFileWriter, PdfFileReader, PdfFileMerger

import StringIO

from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter

packet = StringIO.StringIO()
# create a new PDF with Reportlab
can = canvas.Canvas(packet, pagesize=letter)
can.drawString(290, 720, "Hello world")
can.save()

#move to the beginning of the StringIO buffer
packet.seek(0)
new_pdf = PdfFileReader(packet)
# read your existing PDF
existing_pdf = PdfFileReader("original.pdf")
output = PdfFileWriter()
# add the "watermark" (which is the new pdf) on the existing page
page = existing_pdf.getPage(0)
page.mergePage(new_pdf.getPage(0))
output.addPage(page)
# finally, write "output" to a real file
outputStream = open("destination.pdf", "wb")
output.write(outputStream)
outputStream.close()

cpdf will do the job from the command-line. It isn't python, though (afaik):

cpdf -add-text "Line of text" input.pdf -o output .pdf

Related questions
                            
                                Is there an expression for an infinite iterator?
                            
                                How to run code when a class is subclassed? [duplicate]
                            
                                What is the difference between .py and .pyc files? [duplicate]
                            
                                Why do you have to call .items() when iterating over a dictionary in Python?
                            
                                Generate temporary file names without creating actual file in Python
                            
                                Are for-loops in pandas really bad? When should I care?
                            
                                Equivalent C++ to Python generator pattern
                            
                                How to set a cell to NaN in a pandas dataframe
                            
                                Learning Python from Ruby; Differences and Similarities
                            
                                Displaying better error message than "No JSON object could be decoded"
                            
                                How to create major and minor gridlines with different linestyles in Python
                            
                                What exactly is Python multiprocessing Module's .join() Method Doing?
                            
                                Iterate over the lines of a string
                            
                                Combining node.js and Python
                            
                                Difference between len() and .__len__()?
                            
                                How to save a list as numpy array in python?
                            
                                In-memory size of a Python structure
                            
                                How do I disable a test using pytest?
                            
                                are there dictionaries in javascript like python?
                            
                                Find the max of two or more columns with pandas

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Add text to Existing PDF using Python

Tags:

python

pdf

People also ask

Example for [Python 2.7]:

Example for Python 3.x:

Recent Activity

Donate For Us