
How to extract images from PDF using Ghostscript or ImageMagick?

I need to render or fetch all the images from a specific PDF file. How can I achieve this using Ghostscript or ImageMagick?

mmoghrabi asked Jun 12 '13 12:06



1 Answer

You cannot do it with Ghostscript, but you can with pdfimages, a command-line tool that ships with both Poppler and Xpdf:

pdfimages -j some.pdf subdir/image-prefix

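All the images will now be in subdir/ (which must already exist), with names such as image-prefix-0001.jpg, image-prefix-0002.jpg, and so on. The exact zero-padding and starting index depend on the pdfimages version.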
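In case it helps, here is a minimal sketch of the extraction step wrapped in a small shell function that creates the output directory first. The function name extract_images and its optional third argument are made up for illustration; only the pdfimages -j call comes from the command above.

```shell
#!/bin/sh
# Sketch: extract the embedded images of a PDF into a directory.
# extract_images is a hypothetical helper name, not part of Poppler or Xpdf.
extract_images() {
    pdf=$1                          # input PDF, e.g. some.pdf
    outdir=$2                       # target directory, e.g. subdir
    prefix=${3:-image-prefix}       # file name prefix, defaults to the one used above

    # Create the directory up front so pdfimages can write into it.
    mkdir -p "$outdir" || return 1

    # -j writes JPEG-encoded images as .jpg; everything else becomes PPM/PBM.
    pdfimages -j "$pdf" "$outdir/$prefix"
}

# Usage (assumes pdfimages is installed and some.pdf exists):
#   extract_images some.pdf subdir
```

Passing the directory and prefix as arguments keeps the call to pdfimages itself identical to the one-liner above.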

The -j parameter makes the command write images that are stored as JPEG directly to .jpg files. Images that are not stored as JPEG are written as PPM (color) or PBM (monochrome) files instead, which you can always convert using ImageMagick:

convert subdir/image-prefix-0033.ppm subdir/image-prefix-0033.jpeg
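If there are many leftover PPM or PBM files, the same conversion can be looped over the whole directory. This is only a sketch: convert_extracted is a hypothetical helper name, and it assumes ImageMagick's convert command is on the PATH.

```shell
#!/bin/sh
# Sketch: convert every PPM/PBM file in a directory to JPEG with ImageMagick.
# convert_extracted is a hypothetical helper name used for illustration.
convert_extracted() {
    dir=$1
    for f in "$dir"/*.ppm "$dir"/*.pbm; do
        # An unmatched glob stays literal in sh, so skip names that do not exist.
        [ -e "$f" ] || continue
        # Strip the old extension and write a .jpeg next to the original file.
        convert "$f" "${f%.*}.jpeg"
    done
}

# Usage (assumes ImageMagick is installed):
#   convert_extracted subdir
```

On ImageMagick 7 the main command is magick, so the convert line may need to be changed to magick there. The original PPM/PBM files are left in place.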
Kurt Pfeifle answered Oct 12 '22 13:10