Add text to existing PDF document in Python

~~I'm trying to convert a PDF to a PNG of the same size as the PDF, which is an A4 page.~~

convert my_pdf.pdf -density 300x300 -page A4 my_png.png

The resulting PNG file, however, is 595 × 842 px, which is the size of an A4 page at 72 dpi rather than 300 dpi. I was thinking of using PIL to write some text onto some of the PDF's fields and then convert the image back to PDF, but currently the image is coming out wrong.
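To make the numbers above concrete, here is a small arithmetic sketch of the pixel size an A4 page should have at a given density. The helper name `page_pixels` is made up for illustration. A plausible cause, which the post does not confirm, is that ImageMagick applies `-density` to the input that follows it, so placing it after `my_pdf.pdf` leaves the default 72 dpi in effect.

```python
MM_PER_INCH = 25.4
A4_MM = (210, 297)  # ISO A4 width and height in millimetres


def page_pixels(size_mm, dpi):
    """Return (width, height) in pixels for a page size in mm at the given dpi."""
    return tuple(round(side / MM_PER_INCH * dpi) for side in size_mm)


print(page_pixels(A4_MM, 72))   # (595, 842)
print(page_pixels(A4_MM, 300))  # (2480, 3508)
```

So a 300 dpi rendering of an A4 page would be expected to measure about 2480 × 3508 px, not 595 × 842 px.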

Edit: I was approaching the problem from the wrong angle. The correct approach doesn't involve ImageMagick at all.

How do you add text to a PDF in Python?

Read your PDF using PdfFileReader(); we'll call this input. Create a new PDF containing the text to add using ReportLab, and keep it in an in-memory buffer. Read that buffer using PdfFileReader(); we'll call this text. Create a new PDF object using PdfFileWriter(); we'll call this output. Merge the page from text onto the page from input, add the merged page to output, and write output to a file.

After some more searching I finally found the solution. It turns out that this was the correct approach after all, yet I feel that the answer wasn't verbose enough. It appears that the poster probably took it from here (same variable names, etc.).

The idea: create a new blank PDF with ReportLab that contains only the text string, then merge it onto the existing page as a watermark using pyPdf.

from pyPdf import PdfFileWriter, PdfFileReader
import StringIO
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter
packet = StringIO.StringIO()
# create a new PDF with Reportlab
can = canvas.Canvas(packet, pagesize=letter)
can.drawString(100, 100, "Hello world")
can.save()

# move to the beginning of the StringIO buffer
packet.seek(0)
new_pdf = PdfFileReader(packet)
# read your existing PDF
existing_pdf = PdfFileReader(file("mypdf.pdf", "rb"))
output = PdfFileWriter()
# add the "watermark" (which is the new pdf) on the existing page
page = existing_pdf.getPage(0)
page.mergePage(new_pdf.getPage(0))
output.addPage(page)
# finally, write "output" to a real file
outputStream = file("/home/joe/newpdf.pdf", "wb")
output.write(outputStream)
outputStream.close()

Hope this helps somebody else.
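One detail the snippet leaves implicit: the coordinates passed to drawString(100, 100, ...) are in points (1/72 inch), measured from the bottom-left corner of the page. If you measured the target position in millimetres from the top-left corner, a conversion along these lines may help. This is only a sketch: the helper name `top_left_mm_to_points` is hypothetical, and the A4 default height is an assumption taken from the question (the snippet itself uses letter).

```python
MM_TO_PT = 72 / 25.4             # points per millimetre
A4_HEIGHT_PT = 297 * MM_TO_PT    # assumed page height (A4), in points


def top_left_mm_to_points(x_mm, y_mm, page_height_pt=A4_HEIGHT_PT):
    """Convert a position in mm from the top-left corner to points from the bottom-left."""
    x_pt = x_mm * MM_TO_PT
    # The y axis points upwards in PDF space, so flip it against the page height.
    y_pt = page_height_pt - y_mm * MM_TO_PT
    return x_pt, y_pt


x, y = top_left_mm_to_points(20, 30)
print(round(x, 1), round(y, 1))
```

The resulting pair can then be passed straight to can.drawString(x, y, "Hello world").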

I just tried the solution above, but I had quite some trouble getting it to run in Python 3, so I would like to share my modifications. The adapted code looks as follows:

from PyPDF2 import PdfFileWriter, PdfFileReader
import io
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter

packet = io.BytesIO()

# create a new PDF with Reportlab
can = canvas.Canvas(packet, pagesize=letter)
can.drawString(100, 100, "Hello world")
can.save()

# move to the beginning of the BytesIO buffer
packet.seek(0)
new_pdf = PdfFileReader(packet)
# read your existing PDF
existing_pdf = PdfFileReader(open("mypdf.pdf", "rb"))
output = PdfFileWriter()
# add the "watermark" (which is the new pdf) on the existing page
page = existing_pdf.getPage(0)
page2 = new_pdf.getPage(0)
page.mergePage(page2)
output.addPage(page)
# finally, write "output" to a real file
with open("newpdf.pdf", "wb") as outputStream:
    output.write(outputStream)

Note that page.mergePage may now throw a TypeError. This turns out to be a porting error in PyPDF2. Please refer to this question for the solution: "Porting to Python3: PyPDF2 mergePage() gives TypeError".

Tags:

python

pdf-generation

imagemagick

Uku Loskit

2 Answers

Uku Loskit

Werner Trelawney
