I have some .pdf files with more than 500 pages each, but I need only a few pages from each file. The documents' title pages must be preserved. I know exactly which page numbers the program should remove. How can I do this in a Python 2.7 environment installed on top of MS Visual Studio?
The PyMuPDF library makes it easy to delete pages from a PDF file. You can delete a single page or several pages at once, and you can pass a list of page numbers to work on. First, import the `fitz` module, which is the name the PyMuPDF package is imported under.
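As a rough illustration of the PyMuPDF approach, here is a minimal sketch. It assumes PyMuPDF is installed and that your version provides `Document.select()`. The function names `indices_to_keep` and `delete_pages` are made up for this example. Whether a PyMuPDF build is available for your Python 2.7 setup is something you would need to check.

```python
def indices_to_keep(page_count, pages_to_delete):
    """Return the 0-based page indices that remain after deletion."""
    doomed = set(pages_to_delete)
    return [i for i in range(page_count) if i not in doomed]


def delete_pages(src_path, dst_path, pages_to_delete):
    """Write a copy of src_path without the listed 0-based pages.

    Hypothetical helper: relies on PyMuPDF's Document.select().
    """
    # Imported here so indices_to_keep() can be used without PyMuPDF.
    import fitz  # PyMuPDF

    doc = fitz.open(src_path)
    try:
        keep = indices_to_keep(len(doc), pages_to_delete)
        doc.select(keep)    # retain only these pages, in this order
        doc.save(dst_path)  # save to a new file, leaving the source untouched
    finally:
        doc.close()
```

For example, `delete_pages('source.pdf', 'newfile.pdf', [3, 4, 5])` would drop the fourth through sixth pages and keep the title page at index 0.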
In Adobe Acrobat, choose “Tools” > “Organize Pages”, or select “Organize Pages” from the right pane. To select pages to delete, click the thumbnail of the page or pages you want to remove, then click the “Delete” icon to remove them from the file.
There are a lot of different kinds of data to decode when opening a PDF file! Fortunately, the Python ecosystem has some great packages for reading, manipulating, and creating PDF files.
Try using PyPDF2.
Instead of deleting pages, create a new document and add all pages which you don't want to delete.
Some sample code (originally adapted from a BinPress tutorial that is no longer online):
    from PyPDF2 import PdfFileWriter, PdfFileReader

    pages_to_keep = [1, 2, 10]  # page numbering starts from 0

    # Keep the source file open until the output is written:
    # PyPDF2 reads page data lazily from the input stream.
    with open('source.pdf', 'rb') as src:
        infile = PdfFileReader(src)
        output = PdfFileWriter()

        for i in pages_to_keep:
            output.addPage(infile.getPage(i))

        with open('newfile.pdf', 'wb') as f:
            output.write(f)
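or, if it is easier to list the pages to drop:

    from PyPDF2 import PdfFileWriter, PdfFileReader

    pages_to_delete = [3, 4, 5]  # page numbering starts from 0

    # Keep the source file open until the output is written:
    # PyPDF2 reads page data lazily from the input stream.
    with open('source.pdf', 'rb') as src:
        infile = PdfFileReader(src)
        output = PdfFileWriter()

        # Copy every page except the ones marked for deletion.
        for i in range(infile.getNumPages()):
            if i not in pages_to_delete:
                output.addPage(infile.getPage(i))

        with open('newfile.pdf', 'wb') as f:
            output.write(f)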
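Since both snippets count pages from 0 while a PDF viewer usually shows them starting from 1, a small conversion step can help avoid off-by-one mistakes and accidental removal of the title page. The following is only a sketch. The function name `to_zero_based` and its `protected` parameter are invented for illustration, and it assumes the title page is the first page of the file.

```python
def to_zero_based(printed_pages, page_count, protected=(0,)):
    """Convert 1-based page numbers into sorted 0-based indices to delete.

    Indices listed in `protected` (by default the first page, assumed
    here to be the title page) are never returned.
    """
    result = set()
    for n in printed_pages:
        if not 1 <= n <= page_count:
            raise ValueError("page %d is outside 1..%d" % (n, page_count))
        result.add(n - 1)
    return sorted(result - set(protected))


# Pages 1, 4, 5 and 6 as shown in a viewer; page 1 is protected.
pages_to_delete = to_zero_based([1, 4, 5, 6], page_count=10)
print(pages_to_delete)  # [3, 4, 5]
```

The resulting list can be used as `pages_to_delete` in the loop above. The code avoids Python 3 only syntax, so it should also run under Python 2.7.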