
How to get DPI of a PDF file?

Using ImageMagick, Ghostscript, or any PHP code, how can I get the DPI value of PDF files? Here are links to two demo files:

  • http://jmp.sh/O5g5wL4 -- of 72 DPI
  • http://jmp.sh/RxrnYrY -- of 300 DPI

I have used

$image = new Imagick();
$image->readImage('xyz.pdf');
$resolutions = $image->getImageResolution();

It gives the same result for two different PDF files having different DPI.

I have also used

pdfimages -list xyz.pdf

It gives a list of all the image information, but how do I fetch the DPI value from that list?

How to get the exact DPI value of a PDF?

Swagat Pritam Sahoo asked Apr 24 '18 16:04


1 Answer

As fmw42 says, PDF files themselves have no resolution. However, in your case both files consist of nothing but an image. In one case the image is ~48 MB and in the other it's around 200 MB.

The reason is that the images have a different effective resolution.

In PDF the image is simply a bitmap, a sequence of coloured pixels. These are then drawn onto the underlying media. At that point there is no inherent resolution; the pixels are simply spread over an area of a specific size, in your case roughly 22 inches by 82 inches.

The effective resolution is given by dividing the number of pixels in the image along a dimension by the length, in inches, that the image covers in that dimension.

So if I have an image which is 1000x1000 pixels, and I draw it in a 1 inch square, then the effective resolution of the image is 1000 dpi. If I change my mind and draw it in a square 4 inches by 4 inches, then the effective resolution is 250 dpi.

The image hasn't changed, just the area it covers.

Now consider two images, each drawn in a 1 inch square. The first image is 1000x1000, the second is 500x500. The effective resolution of the first image is 1000 dpi, and the effective resolution of the second is 500 dpi.

So you can see that, in PDF, the effective resolution of the image is a combination of the dimensions of the image, and the dimensions of the media it covers.
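The arithmetic above can be written as a one-line function. This is only a minimal sketch; the name effective_dpi is made up for illustration, and it assumes you already know how many inches the image covers along the axis in question.

```python
def effective_dpi(pixels: int, inches: float) -> float:
    """Effective resolution along one axis: pixel count / covered length in inches."""
    if inches <= 0:
        raise ValueError("covered length must be positive")
    return pixels / inches


# The examples from the text above.
print(effective_dpi(1000, 1))  # 1000x1000 image drawn in a 1 inch square
print(effective_dpi(1000, 4))  # the same image drawn in a 4 inch square
print(effective_dpi(500, 1))   # 500x500 image drawn in a 1 inch square
```

The same image gives a different result whenever the covered area changes, which is why the value cannot be read from the image data alone.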

That's a difficult thing to measure in a PDF file. The area covered is calculated using matrix algebra and can be a combination of several different matrices.
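To give a feel for the matrix part, here is a rough sketch in Python. It relies on two facts about PDF: an image is painted into a 1x1 unit square, and the current transformation matrix (CTM), built by concatenating every "cm" operator in effect, maps that square onto the page. The helper names concat and image_dpi and the two example matrices are hypothetical. The sketch also assumes the default user unit of 1/72 inch and ignores clipping, so treat it as an illustration rather than a general solution.

```python
import math
from typing import Tuple

# A PDF matrix in the usual six-number form [a b c d e f].
Matrix = Tuple[float, float, float, float, float, float]


def concat(inner: Matrix, outer: Matrix) -> Matrix:
    """Return inner x outer: apply `inner` first, then `outer` (PDF row-vector convention)."""
    a1, b1, c1, d1, e1, f1 = inner
    a2, b2, c2, d2, e2, f2 = outer
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


def image_dpi(width_px: int, height_px: int, ctm: Matrix) -> Tuple[float, float]:
    """Effective (horizontal, vertical) dpi of an image painted under `ctm`."""
    a, b, c, d, _e, _f = ctm
    width_pt = math.hypot(a, b)   # length of the transformed unit x-vector, in points
    height_pt = math.hypot(c, d)  # length of the transformed unit y-vector, in points
    return width_px / (width_pt / 72.0), height_px / (height_pt / 72.0)


# Hypothetical content stream: "2 0 0 2 0 0 cm" followed by "144 0 0 144 36 36 cm".
outer = (2, 0, 0, 2, 0, 0)
inner = (144, 0, 0, 144, 36, 36)
ctm = concat(inner, outer)  # the image ends up covering 288x288 points, i.e. 4x4 inches
print(image_dpi(1000, 1000, ctm))
```

Using the vector lengths rather than just a and d means the result stays the same if the image is rotated on the page.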

The actual dimensions of the image, by contrast, are quite easy to determine: they are given in the image dictionary. Your images are 1620x5868 and 3372x12225. In both cases the media is the same size: 22.5x81.5 inches.

Since the images cover the entire media, the effective resolutions are:

1620/22.5 = 72 by 5868/81.5 = 72

3372/22.5 = 149.866 by 12225/81.5 = 150

I think MuPDF will give you both the image dimensions and the media dimensions. Assuming all your PDF files are constructed like this, you can then simply perform the maths. Note, though, that this won't be so simple for ordinary PDF files, where images don't cover the entire media.

Using mutool info -I -M 150-dpi.pdf gives:

Retrieving info from pages 1-1...

Mediaboxes (1): 1 (6 0 R): [ 0 0 1620 5868 ]

Images (1): 1 (6 0 R): [ DCT ] 3375x12225 8bpc DevCMYK (12 0 R)

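So there are your image dimensions and your media size. The MediaBox is expressed in points (1/72 inch), so 1620x5868 points is 22.5x81.5 inches. All you need to do is divide the pixel dimensions by the media dimensions in inches.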
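As a rough sketch, that division could be automated in Python along the following lines. The regular expressions are assumptions based only on the output shown above, the function name dpi_from_mutool_info is made up, and the approach is only valid when a single image covers the whole page. The MediaBox values are treated as points (1/72 inch).

```python
import re

# Sample text copied from the mutool output shown above.
SAMPLE = """\
Mediaboxes (1): 1 (6 0 R): [ 0 0 1620 5868 ]
Images (1): 1 (6 0 R): [ DCT ] 3375x12225 8bpc DevCMYK (12 0 R)
"""

NUM = r"(-?\d+(?:\.\d+)?)"
# Assumed patterns: four numbers in brackets for the MediaBox, WxH followed by "Nbpc" for the image.
MEDIABOX_RE = re.compile(r"\[\s*" + r"\s+".join([NUM] * 4) + r"\s*\]")
IMAGE_RE = re.compile(r"(\d+)x(\d+)\s+\d+bpc")


def dpi_from_mutool_info(text: str) -> tuple:
    """Return (horizontal, vertical) dpi from the first MediaBox and first image found."""
    box = MEDIABOX_RE.search(text)
    img = IMAGE_RE.search(text)
    if box is None or img is None:
        raise ValueError("could not find a MediaBox and an image in the text")
    x0, y0, x1, y1 = map(float, box.groups())
    width_px, height_px = map(int, img.groups())
    width_in = abs(x1 - x0) / 72.0   # points to inches
    height_in = abs(y1 - y0) / 72.0
    if width_in == 0 or height_in == 0:
        raise ValueError("MediaBox has zero width or height")
    return width_px / width_in, height_px / height_in


print(dpi_from_mutool_info(SAMPLE))
```

In practice the text would come from running mutool info -I -M on the file (for example through subprocess.run) rather than from a hard-coded string.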

Note: In Debian and related distros, mutool is contained in the mupdf-tools package, not in the mupdf package itself. It can therefore be installed with sudo apt install mupdf-tools.

KenS answered Jan 02 '23 20:01