
ImageMagick pdf to black and white pdf

I would like to convert a PDF file to a black-and-white PDF file with ImageMagick, but I have two problems. I use this command:

convert -colorspace Gray  D:\in.pdf D:\out.pdf
  1. This command converts only the FIRST page. How do I convert all pages?
  2. After running this command the resolution is terrible, but if I use the -density 300 option the file size more than doubles. I would like to keep the same DPI setting as the original; how do I do that?

Thanks a lot

szuniverse asked Nov 04 '22 05:11



1 Answer

Assuming you have all the necessary command-line tools installed, you can do the following:

  1. Split and join PDF using pdfseparate and pdfunite (Poppler tools).
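  2. Extract the value to pass to -density from the "Page size" line of pdfinfo, using grep/egrep and, for instance, sed. Note that pdfinfo reports the page size in points rather than DPI, so this is a heuristic for keeping the original resolution, and it will not guarantee the same size of the PDF file.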
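
The extraction in step 2 can be sketched as a small shell function. This is only an illustration: the name page_size is hypothetical, and it assumes pdfinfo prints a line of the form "Page size: 595.276 x 841.89 pts (A4)", with the numbers in points.

```shell
# Hypothetical helper (not part of Poppler): read pdfinfo output on stdin
# and print the page size as WIDTHxHEIGHT, in points, exactly as reported.
page_size() {
  sed -n -E 's/^Page size:[[:space:]]*([0-9]+(\.[0-9]+)?)[[:space:]]*x[[:space:]]*([0-9]+(\.[0-9]+)?).*/\1x\3/p'
}

# Intended use (assumes pdfinfo is installed):
#   pdfinfo in.pdf | page_size
```

A single sed call replaces the grep-plus-two-sed chain, and [[:space:]] is used instead of \s so that it should also work with non-GNU sed.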

Putting it all together, you can use a series of bash commands like the following:

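# Split into single pages, then zero-pad the page numbers so that the shell glob sorts them correctly
pdfseparate in.pdf temp-%d.pdf; for i in $(seq $(ls -1 temp-*.pdf | wc -l)); do mv temp-$i.pdf temp-$(printf %03d $i).pdf; done
# Convert each page to grayscale; -density gets the WIDTHxHEIGHT value parsed from the "Page size" line of pdfinfo
for f in temp-*.pdf; do convert -density $(pdfinfo "$f" | grep -E -o 'Page size:[[:space:]]*[0-9]+(\.[0-9]+)?[[:space:]]*x[[:space:]]*[0-9]+(\.[0-9]+)?' | sed -e 's/^Page size:\s*//' | sed -e 's/\s*x\s*/x/') -colorspace Gray {,bw-}$f; done
# Join the converted pages and remove the temporary files
pdfunite bw-temp-*.pdf out.pdf
rm {bw-,}temp-*.pdf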

Note 1: there is a dirty workaround (for/wc/seq/printf) to get proper ordering for PDFs with 10-999 pages (I did not figure out how to put leading zeros in pdfseparate).
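
As a variation on that workaround, the renaming could be done by a small function that loops over the files themselves instead of counting them. This is a sketch only: pad_pages is a hypothetical name, and it assumes the files are called PREFIX-N.pdf in the current directory, as pdfseparate produces with a pattern like temp-%d.pdf.

```shell
# Hypothetical helper: rename PREFIX-N.pdf to PREFIX-00N.pdf (three digits)
# so that a glob such as PREFIX-*.pdf lists the pages in numeric order.
pad_pages() {
  prefix=$1
  for f in "$prefix"-*.pdf; do
    [ -e "$f" ] || continue              # nothing matched: the glob stayed literal
    n=${f#"$prefix"-}                    # strip the prefix and the dash
    n=${n%.pdf}                          # strip the extension, leaving the page number
    case $n in
      ''|0*|*[!0-9]*) continue ;;        # skip non-numbers and names already zero-padded
    esac
    mv -- "$f" "$(printf '%s-%03d.pdf' "$prefix" "$n")"
  done
}

# Intended use, after pdfseparate in.pdf temp-%d.pdf:
#   pad_pages temp
```

Like the original one-liner, this keeps the order correct only up to 999 pages, because the padding width is fixed at three digits.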

Note 2: I guess ImageMagick treats a PDF as just another raster image file, so for mainly-text files, for instance, this will result in huge PDFs. Thus, this is a very bad method for converting text-based PDFs to B&W.
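
To get a feeling for why rasterized pages become large, the pixel dimensions of a page can be estimated from its size in points and the density, using the fact that one point is 1/72 inch. The function below is a rough sketch under that assumption: raster_size is a hypothetical name, it takes whole-number point sizes only, and the rounding ImageMagick actually applies may differ slightly.

```shell
# Hypothetical helper: estimate the pixel size of a page rendered at a given DPI.
# Usage: raster_size WIDTH_PT HEIGHT_PT DPI   (integers only)
raster_size() {
  w=$(( ($1 * $3 + 36) / 72 ))   # points * dpi / 72, rounded to the nearest pixel
  h=$(( ($2 * $3 + 36) / 72 ))
  printf '%sx%s\n' "$w" "$h"
}

# An A4 page is roughly 595 x 842 points:
raster_size 595 842 72
raster_size 595 842 300
```

At 72 DPI the estimate equals the size in points (595x842), while at 300 DPI it is 2479x3508, about 17 times as many pixels per page, all stored as image data even when the page contains only text.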

trybik answered Nov 08 '22 09:11
