I'd like to turn a multipage PDF document into a list of image objects in Python, without saving the images to disk (I'd like to process them with PIL Image). So far I can only do this by writing the images to files first:
from wand.image import Image
with Image(filename='source.pdf') as img:
    with img.convert('png') as converted:
        converted.save(filename='pyout/page.png')
But how could I turn the img objects above directly into a list of PIL.Image objects?
pip install pdf2image
from pdf2image import convert_from_path, convert_from_bytes
images = convert_from_path('/path/to/my.pdf')
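You may need to install Pillow as well. pdf2image relies on the poppler command-line tools, so this might only work out of the box on Linux; on other platforms poppler has to be installed separately.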
https://github.com/Belval/pdf2image
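If the PDF is already in memory rather than on disk, the `convert_from_bytes` function imported above takes the raw bytes instead of a path. A minimal sketch, assuming pdf2image and poppler are installed; `pdf_bytes_to_images` is a hypothetical helper name and the `dpi` default of 200 is only an illustrative choice:

```python
def pdf_bytes_to_images(pdf_bytes, dpi=200):
    """Render every page of an in-memory PDF to a list of PIL images."""
    # Imported lazily so this sketch only needs pdf2image (and poppler)
    # at the moment the function is actually called.
    from pdf2image import convert_from_bytes

    # convert_from_bytes returns one PIL.Image object per page.
    return convert_from_bytes(pdf_bytes, dpi=dpi)
```

You would call it with something like `pdf_bytes_to_images(open('/path/to/my.pdf', 'rb').read())` and get back the same kind of list that `convert_from_path` returns.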
Results may differ between this method and the Wand-based one below.
Python 3.4, using Wand:
from PIL import Image
from wand.image import Image as wimage
import os
import io
if __name__ == "__main__":
    filepath = "fill this in"
    assert os.path.exists(filepath)
    page_images = []
    with wimage(filename=filepath, resolution=200) as img:
        for page_wand_image_seq in img.sequence:
            page_wand_image = wimage(page_wand_image_seq)
            page_jpeg_bytes = page_wand_image.make_blob(format="jpeg")
            page_jpeg_data = io.BytesIO(page_jpeg_bytes)
            page_image = Image.open(page_jpeg_data)
            page_images.append(page_image)
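The same Wand-to-PIL conversion could be wrapped in a reusable function. This is only a sketch under a few assumptions: `pdf_to_pil_images` is a hypothetical name, the `fmt` parameter defaults to PNG (lossless, as in the question) rather than the JPEG used above, and Wand, ImageMagick, and Pillow are all installed:

```python
import io


def pdf_to_pil_images(pdf_path, resolution=200, fmt="png"):
    """Return one PIL image per page of pdf_path, without writing image files."""
    # Imported lazily so the sketch only needs Wand and Pillow when called.
    from PIL import Image
    from wand.image import Image as WandImage

    pages = []
    with WandImage(filename=pdf_path, resolution=resolution) as pdf:
        for frame in pdf.sequence:
            # Copy each page into its own Wand image and encode it in memory.
            with WandImage(image=frame) as page:
                blob = page.make_blob(format=fmt)
            # PIL reads the encoded bytes from an in-memory buffer.
            pages.append(Image.open(io.BytesIO(blob)))
    return pages
```

The `with` blocks release the Wand resources for each page as soon as its bytes have been extracted, which may matter for long documents rendered at high resolution.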
Lastly, you can make a system call to mogrify, but that can be more complicated as you need to manage temporary files.
A simple way is to save the image files, read them with PIL, and delete them afterwards.
I recommend using the pdf2image package. Before using it, you might need to install the poppler package via Anaconda:
conda install -c conda-forge poppler
If you get stuck, update conda before installing:
conda update conda
conda update anaconda
After installing poppler, install pdf2image via pip:
pip install pdf2image
Then run this code:
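from pdf2image import convert_from_path
dpi = 500  # dots per inch
pdf_file = 'work.pdf'
# convert_from_path returns a list with one PIL image per page
pages = convert_from_path(pdf_file, dpi=dpi)
# write each page to disk as output_0.jpg, output_1.jpg, ...
for i, page in enumerate(pages):
    page.save('output_{}.jpg'.format(i), 'JPEG')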
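After this, read the saved files back with PIL and delete them. (The pages list returned by convert_from_path already holds PIL images, so the round trip through disk is only needed if you want the JPEG-encoded versions.)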
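The read-then-delete step could look roughly like this. It is a sketch, not part of pdf2image: `load_and_delete` is a hypothetical helper, and it assumes the paths point at image files that Pillow can open:

```python
import os

from PIL import Image


def load_and_delete(paths):
    """Read each image file fully into memory, then remove it from disk."""
    images = []
    for path in paths:
        # copy() forces the pixel data to be read, so the result no longer
        # depends on the file; the with-block closes the file handle.
        with Image.open(path) as img:
            images.append(img.copy())
        os.remove(path)
    return images
```

With the file names used above, the call would be something like `load_and_delete(['output_{}.jpg'.format(i) for i in range(len(pages))])`.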