
Why does the combination pdf2ps / ps2pdf shrink the PDF?

When researching how to compress a bunch of PDFs with pictures inside (ideally in a lossless fashion, but I'll settle for lossy) I found that a lot of people recommend doing this:

$ pdf2ps file.pdf
$ ps2pdf file.ps

This works! The resulting file is smaller and looks at least good enough.

  • How / why does this work?
  • Which settings can I tweak in this process?
  • If there is some lossy conversion, which one is that?
  • Where is the catch?
vektor asked Apr 28 '15 14:04


1 Answer

People who recommend this procedure rarely do so from a background of expertise or knowledge -- it's rather based on gut feelings.

The detour of generating a new PDF via PostScript and back (also called "refrying a PDF") is never going to give you optimal results. Sometimes it is useful, e.g. in cases where the original PDF does not print at all, or cannot be processed by another application. But these cases are very rare.

In any case, this "roundtrip" conversion will never produce a PDF file identical to the original.

Also, the pdf2ps and ps2pdf tools aren't independent tools at all: they are just simple wrapper scripts around a Ghostscript (gs or gswin32c.exe) command line. You can check that yourself by doing:

cat $(which ps2pdf)
cat $(which pdf2ps)

This will also reveal the (default) parameters these simple wrappers use for the respective conversions.

If you are unlucky, you will have an ancient Ghostscript installed. The PostScript which pdf2ps then generates will be Level 1 PS, and this will be "lossy" for many fonts used by more modern PDF files, resulting in rasterization of fonts that were previously vector outlines. Not exactly the output you'd like to look at...
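To see which Ghostscript the wrappers would end up calling, you can ask it for its version. This is only a small sketch: it assumes the executable is named gs and is on your PATH (on Windows it would be gswin32c.exe or similar), and the variable name gs_version is just for illustration. Which version counts as "ancient" is not something this check decides for you.

```shell
# Print the installed Ghostscript version, or a note if gs is not on PATH.
# Assumption: the executable is called "gs".
if command -v gs >/dev/null 2>&1; then
    gs_version=$(gs --version)
    echo "Ghostscript $gs_version"
else
    gs_version=""
    echo "gs not found on PATH"
fi
```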

Since both tools are using Ghostscript anyway (but behind your back), you are better off running Ghostscript yourself. This gives you more control over the parameters it uses. A particular advantage is that this way you get a direct PDF->PDF conversion, without any detour via an intermediate PostScript file.
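A direct PDF->PDF run could look roughly like the sketch below. The helper name shrink_pdf and the default preset /ebook are my own choices for illustration, not something the wrappers use; -sDEVICE=pdfwrite, -o and -dPDFSETTINGS are regular Ghostscript options. The presets mainly differ in how strongly they downsample embedded images, which is where the lossy part of the size reduction comes from, so treat the preset as a starting point to experiment with rather than a recommendation.

```shell
# Sketch: direct PDF -> PDF conversion with Ghostscript, no PostScript detour.
# shrink_pdf is a hypothetical helper name; the default preset is an assumption.
shrink_pdf() {
    src=$1
    dst=$2
    preset=${3:-/ebook}   # one of /screen /ebook /printer /prepress /default

    # Refuse to run if the input file does not exist.
    [ -f "$src" ] || { echo "shrink_pdf: no such file: $src" >&2; return 1; }

    # -o sets the output file and implies -dBATCH -dNOPAUSE.
    gs -o "$dst" \
       -sDEVICE=pdfwrite \
       -dPDFSETTINGS="$preset" \
       "$src"
}
```

For example, shrink_pdf file.pdf file-small.pdf /screen followed by ls -l file.pdf file-small.pdf shows how much a given preset actually saves; check the output visually before throwing away the original.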

Here are a few answers which would give you some hints about what parameters you could use in order to drive the file size down in a semi-controlled way in your output PDF:

  • Optimize PDF files (with Ghostscript or other) (StackOverflow)
  • Remove / Delete all images from a PDF using Ghostscript or ImageMagick (StackOverflow)
Kurt Pfeifle answered Oct 16 '22 16:10