How to convert PDF to low-resolution (but good quality) JPEG?

When I use the following Ghostscript command to generate JPEG thumbnails from PDFs, the image quality is often very poor:

gs -q -dNOPAUSE -dBATCH -sDEVICE=jpeggray -g465x600 -dUseCropBox -dPDFFitPage -sOutputFile=pdf_to_lowres.jpg test.pdf

By contrast, if I use Ghostscript to generate a high-resolution PNG and then use mogrify to convert it to a low-resolution JPEG, I get pretty good results:

gs -q -dNOPAUSE -dBATCH -sDEVICE=pnggray -g2550x3300 -dUseCropBox -dPDFFitPage -sOutputFile=pdf_to_highres.png test.pdf
mogrify -thumbnail 465x600 -format jpg -write pdf_to_highres_to_lowres.jpg pdf_to_highres.png

Is there any way to achieve good results while bypassing the intermediate PDF -> high-res PNG step? I need to do this for a large number of PDFs, so I'm trying to minimize the compute time.

Here are links to the images referenced above:

  1. test.pdf
  2. pdf_to_lowres.jpg
  3. pdf_to_highres.png
  4. pdf_to_highres_to_lowres.jpg
Sharad Goel asked Sep 21 '11 17:09



2 Answers

One option that seems to improve the output a lot: -dDOINTERPOLATE. Here's what I got by running the same command as you but with the -dDOINTERPOLATE option:

[Image: JPEG with -dDOINTERPOLATE]

I'm not sure what interpolation method this uses, but the result looks much better than the output without it.
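For reference, here is a minimal sketch of the question's command with the extra switch added, wrapped in a small function. The name make_thumb is hypothetical and only there for illustration; every flag comes from the question except -dDOINTERPOLATE. Newer Ghostscript releases appear to document -dInterpolateControl as the preferred switch, so check the documentation for your version.

```shell
# Sketch only: make_thumb is a hypothetical helper name, not part of Ghostscript.
# $1 = input PDF, $2 = output JPEG
make_thumb() {
    # Same flags as the command in the question, plus -dDOINTERPOLATE
    gs -q -dNOPAUSE -dBATCH -sDEVICE=jpeggray -g465x600 \
        -dUseCropBox -dPDFFitPage -dDOINTERPOLATE \
        "-sOutputFile=$2" "$1"
}
```

It would be called as make_thumb test.pdf pdf_to_lowres.jpg, which writes the thumbnail in a single Ghostscript pass with no intermediate PNG.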

P.S. Consider outputting PNG images (-sDEVICE=pnggray) instead of JPEG. Most PDF documents contain text and just a few solid colors, which lossless PNG handles without the artifacts JPEG adds around sharp edges, so it's usually the more appropriate choice.
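Since the question mentions a large number of PDFs, the PNG suggestion could be applied to a whole directory along these lines. This is a rough sketch: png_thumbs is a hypothetical name, the output naming (same base name with a .png extension) is an assumption, and keeping -dDOINTERPOLATE alongside pnggray is my own combination of the two suggestions.

```shell
# Sketch only: png_thumbs is a hypothetical helper name.
# $1 = directory containing the PDFs; thumbnails are written next to them.
png_thumbs() {
    dir=$1
    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue    # glob matched nothing, so skip
        # Same geometry and fitting flags as the question, with pnggray output
        gs -q -dNOPAUSE -dBATCH -sDEVICE=pnggray -g465x600 \
            -dUseCropBox -dPDFFitPage -dDOINTERPOLATE \
            "-sOutputFile=${pdf%.pdf}.png" "$pdf"
    done
}
```

Calling png_thumbs ./pdfs would leave one PNG thumbnail beside each PDF in that directory.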

Jordan Running answered Sep 30 '22 22:09



Your PDF looks like it is just a wrapper around a JPEG already.

Try using the pdfimages program from xpdf (it also ships with poppler-utils) to extract the embedded image directly rather than rendering the page.
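A rough sketch of that approach, assuming the embedded images really are JPEG (DCT) streams: pdfimages -j writes them out as .jpg files without re-encoding, and the mogrify call from the question then shrinks them. The function name extract_thumbs and the "page" file prefix are hypothetical, and the exact numbering of the output files varies between pdfimages versions, so the loop matches them with a glob.

```shell
# Sketch only: extract_thumbs and the "page" prefix are hypothetical names.
# $1 = input PDF, $2 = output directory
extract_thumbs() {
    pdf=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    # -j saves DCT-encoded images as JPEG files; other images come out as PPM/PBM
    pdfimages -j "$pdf" "$outdir/page" || return 1
    for img in "$outdir"/page-*.jpg; do
        [ -e "$img" ] || continue    # no JPEG was extracted
        # mogrify overwrites the file in place with the resized version
        mogrify -thumbnail 465x600 "$img"
    done
}
```

For example, extract_thumbs test.pdf thumbs would leave the resized images in the thumbs directory. If the PDF contains vector content or non-JPEG images, this shortcut does not apply and rendering with Ghostscript is still needed.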

David answered Sep 30 '22 22:09
