
How to handle multi-page images in PythonMagick?

I want to convert some multi-page .tif or .pdf files to individual .png images. From the command line (using ImageMagick) I just do:

convert multi_page.pdf file_out.png

And I get all the pages as individual images (file_out-0.png, file_out-1.png, ...).

I would like to handle this file conversion within Python. Unfortunately PIL cannot read .pdf files, so I want to use PythonMagick. I tried:

import PythonMagick
im = PythonMagick.Image('multi_page.pdf')
im.write("file_out%d.png")

or just

im.write("file_out.png")

But I only get one page converted to .png. Of course I could load each page individually and convert them one by one, but there must be a way to do them all at once?

Tickon asked May 07 '12 22:05


3 Answers

ImageMagick is not memory efficient, so if you try to read a large PDF (100 pages or so), the memory requirement will be huge and it might crash or seriously slow down your system. So reading all pages at once with PythonMagick is a bad idea after all; it's not safe. For PDFs, I ended up converting page by page, but for that I first need the number of pages, which I get using pyPdf. It's reasonably fast:

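import pyPdf
import PythonMagick

# Count the pages first so each one can be loaded on its own.
pdf_im = pyPdf.PdfFileReader(open('multi_page.pdf', 'rb'))
npage = pdf_im.getNumPages()
for p in range(npage):
    # 'multi_page.pdf[p]' selects a single page (zero-based index).
    im = PythonMagick.Image('multi_page.pdf[' + str(p) + ']')
    im.write('file_out-' + str(p) + '.png')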
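
The page-by-page naming used in the loop above can be factored into a small helper. This is only a sketch: `page_jobs`, `out_prefix` and `ext` are hypothetical names, not part of PythonMagick or pyPdf, and the zero-padding is an added convenience so that the output files sort in page order once there are ten or more pages.

```python
def page_jobs(src, npage, out_prefix='file_out', ext='.png'):
    """Yield (page_spec, out_name) pairs, one per zero-based page index.

    page_spec uses the 'file.pdf[N]' form from the loop above;
    out_name is zero-padded to the width of the largest page index.
    """
    width = len(str(max(npage - 1, 0)))
    for p in range(npage):
        page_spec = '%s[%d]' % (src, p)
        out_name = '%s-%s%s' % (out_prefix, str(p).zfill(width), ext)
        yield page_spec, out_name


if __name__ == '__main__':
    for page_spec, out_name in page_jobs('multi_page.pdf', 3):
        print(page_spec, '->', out_name)
```

Each pair can then be passed to `PythonMagick.Image(page_spec)` and `im.write(out_name)`, so only one page is held in memory at a time.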
Tickon answered Oct 23 '22 13:10


A more complete example based on the answer by Ivo Flipse and http://p-s.co.nz/wordpress/pdf-to-png-using-pythonmagick/

This uses a higher resolution and uses PyPDF2 instead of the older pyPdf.

import sys
import PyPDF2
import PythonMagick

pdffilename = sys.argv[1]
pdf_im = PyPDF2.PdfFileReader(open(pdffilename, "rb"))
npage = pdf_im.getNumPages()
print('Converting %d pages.' % npage)
for p in range(npage):
    im = PythonMagick.Image()
    im.density('300')
    im.read(pdffilename + '[' + str(p) +']')
    im.write('file_out-' + str(p)+ '.png')
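
To get a feel for what the density of 300 means for the output, here is a rough calculation. It assumes the density is interpreted as dots per inch and relies on PDF page sizes being measured in points (72 per inch); `rendered_size` is a hypothetical helper, and the real pixel counts may differ by a pixel due to rounding inside ImageMagick.

```python
def rendered_size(width_pt, height_pt, density):
    """Approximate pixel size of a PDF page rasterized at `density` DPI."""
    # PDF page dimensions are given in points; there are 72 points per inch.
    scale = density / 72.0
    return int(round(width_pt * scale)), int(round(height_pt * scale))


if __name__ == '__main__':
    # US Letter is 612 x 792 points (8.5 x 11 inches).
    print(rendered_size(612, 792, 72))
    print(rendered_size(612, 792, 300))
```

For a US Letter page this gives 612 x 792 pixels at 72 DPI and 2550 x 3300 pixels at 300 DPI, roughly 17 times as many pixels per page, which is one more reason to convert one page at a time.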
Daniël van Eeden answered Oct 23 '22 13:10


I had the same problem, and as a workaround I called ImageMagick directly:

import subprocess
params = ['convert', 'src.pdf', 'out.png']
subprocess.check_call(params)
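
A slightly fuller sketch of the same workaround, under some assumptions: that the `convert` executable is on the PATH (on ImageMagick 7 installs the entry point may be `magick` instead), and that multi-page output follows the out-0.png, out-1.png, ... naming described in the question. The function names are hypothetical; `-density` is ImageMagick's resolution option and is placed before the input file so that it applies when the PDF is rasterized.

```python
import re
import shutil
import subprocess
from pathlib import Path


def build_convert_command(src, out, density=None):
    """Return the argument list for an ImageMagick convert call."""
    cmd = ['convert']
    if density is not None:
        # Must precede the input file to affect how the PDF is rendered.
        cmd += ['-density', str(density)]
    cmd += [str(src), str(out)]
    return cmd


def numbered_outputs(out):
    """List files named like out-0.png, out-1.png, ... in page order."""
    out = Path(out)
    pattern = re.compile(
        re.escape(out.stem) + r'-(\d+)' + re.escape(out.suffix) + r'\Z')
    found = []
    for path in out.parent.iterdir():
        match = pattern.match(path.name)
        if match:
            # Keep the numeric index so that page 10 sorts after page 2.
            found.append((int(match.group(1)), path))
    return [path for _, path in sorted(found)]


def convert_all_pages(src, out, density=None):
    """Run convert on src and return the numbered output files it produced."""
    if shutil.which('convert') is None:
        raise RuntimeError('ImageMagick "convert" was not found on PATH')
    subprocess.check_call(build_convert_command(src, out, density))
    return numbered_outputs(out)
```

Calling `convert_all_pages('src.pdf', 'out.png', density=300)` would run the same command as above with a higher resolution and return the page images in order. Note that this still loads the whole document in one ImageMagick process, so the memory concern raised in the first answer applies here too.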
Rakesh answered Oct 23 '22 13:10