
Converting a PDF to a series of images with Python

I'm attempting to use Python to convert a multi-page PDF into a series of JPEGs. I can split the PDF into individual pages easily enough with available tools, but I haven't been able to find anything that can convert PDFs to images.

PIL does not work, as it can't read PDFs. The two options I've found are calling either Ghostscript or ImageMagick through the shell. Neither is viable for me, since this program needs to be cross-platform, and I can't be sure either of those programs will be available on the machines where it will be installed and used.
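For context, the shell route mentioned above would look roughly like the sketch below. It is only an illustration under assumptions: the function names, the output pattern, and the 150 dpi default are made up for the example, and it only works where a Ghostscript executable is already on the PATH, which is exactly the dependency in question.

```python
import shutil
import subprocess

# Usual executable names: "gs" on Linux/macOS, "gswin64c"/"gswin32c" on Windows.
GS_NAMES = ("gs", "gswin64c", "gswin32c")


def find_ghostscript():
    """Return the path of the first Ghostscript executable found on PATH, or None."""
    for name in GS_NAMES:
        path = shutil.which(name)
        if path:
            return path
    return None


def build_gs_command(gs_path, pdf_path, out_pattern="page-%03d.jpg", dpi=150):
    """Build the argument list that renders each PDF page to a numbered JPEG."""
    return [
        gs_path,
        "-dSAFER", "-dBATCH", "-dNOPAUSE",  # non-interactive batch run
        "-sDEVICE=jpeg",                    # JPEG output device
        f"-r{dpi}",                         # rendering resolution in dpi
        f"-sOutputFile={out_pattern}",      # %03d is replaced by the page number
        pdf_path,
    ]


def pdf_to_jpegs(pdf_path, **kwargs):
    """Render the PDF via Ghostscript; fails loudly if Ghostscript is missing."""
    gs_path = find_ghostscript()
    if gs_path is None:
        raise RuntimeError("Ghostscript not found on PATH")
    subprocess.run(build_gs_command(gs_path, pdf_path, **kwargs), check=True)
```

Checking with `shutil.which` first at least lets the program report a clear error on machines where Ghostscript isn't installed, rather than failing somewhere inside the subprocess call.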

Are there any Python libraries out there that can do this?

Jaearess asked Dec 01 '08 19:12




1 Answer

ImageMagick has Python bindings.
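As a rough sketch of what that could look like, here is an example using Wand, one ImageMagick binding for Python. That choice is an assumption, since the answer doesn't name a specific package, and the function names, file naming scheme, and 150 dpi default are made up for illustration. Note that Wand still needs the ImageMagick library installed on the machine, and ImageMagick itself hands PDF rendering off to Ghostscript, so this doesn't remove the external dependency.

```python
def page_filename(stem, index, ext="jpg"):
    """Name for the page at zero-based `index`, numbered from 1 (e.g. page-001.jpg)."""
    return f"{stem}-{index + 1:03d}.{ext}"


def pdf_to_jpegs_wand(pdf_path, stem="page", dpi=150):
    """Render every page of `pdf_path` to a JPEG and return the written filenames."""
    # Imported here so the module still loads on machines without Wand/ImageMagick.
    from wand.image import Image

    written = []
    # `resolution` sets the dpi used when the PDF is rasterized on load.
    with Image(filename=pdf_path, resolution=dpi) as pdf:
        for index, page in enumerate(pdf.sequence):
            # Copy each page into its own Image so it can be saved separately.
            with Image(image=page) as img:
                img.format = "jpeg"
                name = page_filename(stem, index)
                img.save(filename=name)
                written.append(name)
    return written
```

Calling `pdf_to_jpegs_wand("document.pdf")` should write `page-001.jpg`, `page-002.jpg`, and so on into the current directory, assuming Wand, ImageMagick, and Ghostscript are all present.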

Adam Rosenfield answered Oct 14 '22 10:10
