Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to count the numer of pdf pages in python that has blank pdf page also

Tags:

python-2.7

I have tried to print the count of pdf document which includes some blank white pdf page using pypdf module. But it avoids the blanks page and print the count of rest of pages. Below is the code.

import sys

import pyPdf

from pyPdf import PdfFileReader, PdfFileWriter

pdf_document = PdfFileReader(file(normalpdfpath,"r"))

normal = pdf_document.getNumPages()
print normal
like image 439
Deepan Avatar asked Oct 05 '22 02:10

Deepan


2 Answers

step 1:-

pip install pyPDF2

step 2:-

import requests, PyPDF2, io
url = 'sample.pdf' 
response = requests.get(url)
with io.BytesIO(response.content) as open_pdf_file:
  read_pdf = PyPDF2.PdfFileReader(open_pdf_file)
  num_pages = read_pdf.getNumPages()
  print(num_pages)
like image 133
Saleem Avatar answered Nov 12 '22 18:11

Saleem


You may try this, which worked for me:

import re
import os

rxcountpages = re.compile(r"/Type\s*/Page([^s]|$)", re.MULTILINE|re.DOTALL)

def count_pages(filename):
    data = file(filename,"rb").read()
    return len(rxcountpages.findall(data))

if __name__=="__main__":
    parent = "/Users/username/"
    os.chdir(parent)
    filename = 'LaTeX20120726.pdf'
    print count_pages(filename)

Regards

like image 30
Dinever Avatar answered Nov 12 '22 18:11

Dinever