
How to merge two landscape pdf pages using pyPdf

I'm having trouble merging two PDF files with pyPdf. When I run the following code, the watermark (page1) looks fine, but page2 comes out rotated 90 degrees clockwise.

Any ideas what's going on?


from pyPdf import PdfFileWriter, PdfFileReader

# PDF1: A4 landscape page created in Photoshop using PdfCreator
input1 = PdfFileReader(file("base.pdf", "rb"))
page1 = input1.getPage(0)

# PDF2: A4 Landscape page, text only, created using Pisa (www.xhtml2pdf.com)
input2 = PdfFileReader(file("text.pdf", "rb"))
page2 = input2.getPage(0)

# Merge
page1.mergePage(page2)

# Output
output = PdfFileWriter()
output.addPage(page1)
outputStream = file("output.pdf", "wb")
output.write(outputStream)
outputStream.close()
asked May 18 '11 07:05 by Humphrey


2 Answers

You can transform the page while you're merging it into another page. I defined this function to rotate the page around a point as it is merged:

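import math

from pyPdf import utils

def mergeRotateAroundPointPage(page, page2, rotation, tx, ty):
    # Move the pivot point (tx, ty) to the origin.
    translation = [[1, 0, 0],
                   [0, 1, 0],
                   [-tx, -ty, 1]]
    # Rotate about the origin; the angle is given in degrees.
    rotation = math.radians(rotation)
    rotating = [[math.cos(rotation), math.sin(rotation), 0],
                [-math.sin(rotation), math.cos(rotation), 0],
                [0,                   0,                  1]]
    # Move the pivot point back to where it was.
    rtranslation = [[1, 0, 0],
                    [0, 1, 0],
                    [tx, ty, 1]]
    ctm = utils.matrixMultiply(translation, rotating)
    ctm = utils.matrixMultiply(ctm, rtranslation)

    # mergeTransformedPage takes the six-element form [a, b, c, d, e, f].
    return page.mergeTransformedPage(page2, [ctm[0][0], ctm[0][1],
                                             ctm[1][0], ctm[1][1],
                                             ctm[2][0], ctm[2][1]])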

Then you call it like this:

mergeRotateAroundPointPage(page1, page2, 
                page1.get('/Rotate') or 0, 
                page2.mediaBox.getWidth()/2, page2.mediaBox.getWidth()/2)
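
To see what the translate, rotate, translate-back product does, here is a rough standalone sketch that needs no PDF library. The helpers `mat_mul`, `rotation_about_point` and `transform_point` are names I made up for illustration and are not part of pyPdf. It assumes the same row-vector convention as the function above, where a point is `[x y 1]` multiplied on the left of the matrix.

```python
import math

def mat_mul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_about_point(degrees, tx, ty):
    """Build the matrix that rotates counter-clockwise about (tx, ty)."""
    theta = math.radians(degrees)
    to_origin = [[1, 0, 0], [0, 1, 0], [-tx, -ty, 1]]
    rotate = [[math.cos(theta), math.sin(theta), 0],
              [-math.sin(theta), math.cos(theta), 0],
              [0, 0, 1]]
    back = [[1, 0, 0], [0, 1, 0], [tx, ty, 1]]
    # Same order as above: translate, then rotate, then translate back.
    return mat_mul(mat_mul(to_origin, rotate), back)

def transform_point(matrix, x, y):
    """Apply the matrix to the point (x, y), treated as the row [x y 1]."""
    row = [x, y, 1]
    out = [sum(row[k] * matrix[k][j] for k in range(3)) for j in range(3)]
    return out[0], out[1]

ctm = rotation_about_point(90, 100, 100)
pivot = transform_point(ctm, 100, 100)   # the pivot should not move
corner = transform_point(ctm, 0, 0)      # the origin swings around the pivot
```

If the matrices are composed correctly, the pivot stays where it is and every other point moves around it. That makes a quick sanity check before handing the six values to a merge call.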
answered Sep 27 '22 03:09 by speedplane


I found a solution. My code was fine - I just had to change how I generated the original PDF files.

Instead of creating the PDF using PdfCreator and Photoshop, I copied and pasted my Photoshop image into MS Word 2007, and then used its export feature to create the PDF file for page1. It now works great!

So, PdfCreator must be producing PDF files that are not compatible with pyPdf.

answered Sep 26 '22 03:09 by Humphrey