
How to write shell script for finding number of pages in PDF?

Tags:

shell

pdf

I am generating a PDF dynamically. How can I check the number of pages in the PDF using a shell script?

Manish asked Feb 05 '13 09:02




2 Answers

Without any extra package:

strings < file.pdf | sed -n 's|.*/Count -\{0,1\}\([0-9]\{1,\}\).*|\1|p' \
    | sort -rn | head -n 1

Using pdfinfo:

pdfinfo file.pdf | awk '/^Pages:/ {print $2}' 
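
If the count is needed inside a larger script, the pdfinfo one-liner can be wrapped in a function that also reports failure. This is only a sketch: the function names parse_pages and pdf_page_count are made up for illustration, and it assumes pdfinfo (from poppler-utils or xpdf) is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: extract the page count from `pdfinfo` output on stdin.
# Exits non-zero when no "Pages:" line is present.
parse_pages() {
    awk '/^Pages:/ { print $2; found = 1; exit } END { if (!found) exit 1 }'
}

# Hypothetical helper: print the page count of one PDF file.
# Returns 2 if the file is unreadable, 1 if pdfinfo reported no page count.
pdf_page_count() {
    if [ ! -r "$1" ]; then
        echo "cannot read: $1" >&2
        return 2
    fi
    pdfinfo "$1" 2>/dev/null | parse_pages
}
```

A caller could then write pages=$(pdf_page_count file.pdf) || exit 1 and use "$pages" knowing it holds a number.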

Using pdftk:

pdftk file.pdf dump_data | grep NumberOfPages | awk '{print $2}' 

You can also recursively sum the total number of pages in all PDFs via pdfinfo as follows:

find . -xdev -type f -name "*.pdf" -exec pdfinfo "{}" ";" | \
    awk '/^Pages:/ {n += $2} END {print n}'
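
If you also want to see which file contributes how many pages, the same idea can be split into a listing step and a summing step. The following is a sketch rather than a tested tool: list_pdf_pages and sum_pages are hypothetical names, it assumes bash and pdfinfo, and it counts a file as 0 pages when pdfinfo cannot read it. File names containing a newline would break the one-line-per-file format.

```shell
#!/bin/bash
# Hypothetical helper: sum the first tab-separated field of each input line.
# Expects lines of the form "<pages><TAB><path>"; prints 0 for empty input.
sum_pages() {
    awk -F '\t' '{ n += $1 } END { print n + 0 }'
}

# Hypothetical helper: print "<pages><TAB><path>" for every PDF
# under a directory (default: current directory).
list_pdf_pages() {
    find "${1:-.}" -xdev -type f -name '*.pdf' -print0 |
    while IFS= read -r -d '' f; do
        pages=$(pdfinfo "$f" 2>/dev/null | awk '/^Pages:/ {print $2; exit}')
        printf '%s\t%s\n' "${pages:-0}" "$f"
    done
}
```

Running list_pdf_pages . shows the per-file counts, and list_pdf_pages . | sum_pages gives the total.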
Ocaso Protal answered Oct 02 '22 10:10


ImageMagick provides a tool called identify, which prints one line of output per page of a PDF, so counting those lines gives you the page count. ImageMagick is an easy install on OS X with brew.

Here is a working bash script that captures the count in a shell variable and prints it back to the screen:

#!/bin/bash
pdfFile=$1
echo "Processing $pdfFile"
numberOfPages=$(/usr/local/bin/identify "$pdfFile" 2>/dev/null | wc -l | tr -d ' ')
# identify prints info for each page; stderr is sent to /dev/null
# count the lines of output
# trim the whitespace from the wc -l output
echo "The number of pages is: $numberOfPages"

And the output of running it...

$ ./countPages.sh aSampleFile.pdf
Processing aSampleFile.pdf
The number of pages is: 2
$
np0x answered Oct 02 '22 11:10