
How can I programmatically remove a page from a PDF document on a Mac?

Tags: macos, pdf

I have a bunch of PDF documents and all of them contain a title page that I want to remove.

Is there a way to programmatically remove them?

Most of the PDF utilities I found can only combine documents, not remove pages. In the print dialog I can choose to print from page 2 to the end and then print to a file, but I can't find any way to access this function programmatically.

asked Sep 29 '10 06:09 by ceiling cat



2 Answers

Use pdftk.

To remove page 8:

pdftk in.pdf cat 1-7 9-end output out.pdf
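
Since the question is about a whole batch of files, the same command can be wrapped in a loop. What follows is only a minimal sketch: the function names `range_without_page` and `strip_page_from_all` and the default output directory `stripped` are made up for illustration, and it assumes `pdftk` is on the PATH and that every document has more pages than the one being removed.

```shell
# Print the pdftk "cat" ranges that keep every page except page $1.
# Assumption: the document has more than $1 pages, since a range that
# starts past the last page is not valid.
range_without_page() {
  if [ "$1" -le 1 ]; then
    echo "2-end"
  else
    echo "1-$(($1 - 1)) $(($1 + 1))-end"
  fi
}

# Hypothetical helper: for each PDF in directory $1, write a copy without
# page $2 (default: 1, the title page) into directory $3 (default: stripped).
strip_page_from_all() {
  src=$1
  page=${2:-1}
  dest=${3:-stripped}
  ranges=$(range_without_page "$page")
  mkdir -p "$dest"
  for f in "$src"/*.pdf; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    # $ranges is left unquoted on purpose so it splits into separate arguments
    pdftk "$f" cat $ranges output "$dest/$(basename "$f")"
  done
}
```

Calling `strip_page_from_all ~/Documents/reports` (an example path) would leave the originals untouched and put the shortened copies in `./stripped`.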
answered Sep 23 '22 17:09 by Benoit


Just for the record: you can also use Ghostscript:

gs \
  -o removed-page-1-from-input.pdf \
  -sDEVICE=pdfwrite \
  -dFirstPage=2 \
  /path/to/input.pdf

However, pdftk is the better tool for that job (and was already recommended to you).

Also, this Ghostscript command line could change some of the properties of your input.pdf, because it essentially re-distills it. That change may or may not be what you want. To control individual aspects of this behavior (or to suppress some of them), a more complicated command line with more parameters is required.

pdftk will re-use the original PDF objects for each page as-is.


Update

Ghostscript has the additional parameter of -dLastPage too. Together with -dFirstPage this allows for the extraction of page ranges.

The newest versions sport a new parameter, -sPageList. This could be used like this:

-sPageList="1, 5-10, 12-"

to extract pages 1, 5-10 and 12-last from the input document. However, I've not (yet) personally tested this new feature and I'm not sure how reliably it works.
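
Applied to the original question, a page list that skips a single page could be built as sketched below. This is a hedged illustration, not something taken from the Ghostscript documentation: the function names `pagelist_without_page` and `remove_page_with_gs` are hypothetical, the list is written without spaces as a precaution, and it assumes a Ghostscript version that supports -sPageList with open-ended ranges such as "12-".

```shell
# Print a -sPageList value that keeps every page except page $1.
# Assumption: the document has more than $1 pages.
pagelist_without_page() {
  if [ "$1" -le 1 ]; then
    echo "2-"
  elif [ "$1" -eq 2 ]; then
    echo "1,3-"
  else
    echo "1-$(($1 - 1)),$(($1 + 1))-"
  fi
}

# Hypothetical wrapper: remove_page_with_gs INPUT OUTPUT PAGE
remove_page_with_gs() {
  gs -o "$2" -sDEVICE=pdfwrite \
     -sPageList="$(pagelist_without_page "$3")" \
     "$1"
}
```

For example, `remove_page_with_gs in.pdf out.pdf 8` would pass -sPageList=1-7,9- to gs.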

For older versions of Ghostscript (as well as the most recent one), it should work to feed the same input PDF multiple times, with different parameters, to the same GS call in order to extract non-contiguous page selections from a document. You could even combine pages from different documents this way:

gs \
  -o selected-pages.pdf \
  -sDEVICE=pdfwrite     \
  -dFirstPage=2         \
  -dLastPage=2          \
   in1.pdf              \
                        \
  -dFirstPage=10        \
  -dLastPage=15         \
   in1.pdf              \
                        \
  -dFirstPage=1         \
  -dLastPage=1          \
   in1.pdf              \
                        \
  -dFirstPage=4         \
  -dLastPage=6          \
   in2.pdf

Caveats: Combining pages from different documents which use non-embedded fonts or identical font names but different encodings and/or different subsets (with identical fontname-prefixes) may lead to a faulty PDF in the result.
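
Typing the repeated -dFirstPage/-dLastPage groups by hand gets tedious, so the multi-input call above could be generated from a compact list. The following is a rough sketch only: `extract_ranges` and its `FILE:FIRST-LAST` argument format are invented here for illustration, and file names containing a colon would not work with it.

```shell
# Hypothetical helper: extract_ranges OUTPUT FILE:FIRST-LAST [FILE:FIRST-LAST ...]
# Expands each spec into "-dFirstPage=N -dLastPage=M FILE" and runs one gs call.
extract_ranges() {
  out=$1
  shift
  count=$#
  for spec in "$@"; do
    file=${spec%%:*}    # text before the first colon
    range=${spec#*:}    # text after the first colon, e.g. 10-15
    # Append the gs arguments for this spec behind the original specs.
    set -- "$@" "-dFirstPage=${range%-*}" "-dLastPage=${range#*-}" "$file"
  done
  shift "$count"        # drop the original specs, keep the generated arguments
  gs -o "$out" -sDEVICE=pdfwrite "$@"
}
```

With this, the command above would become `extract_ranges selected-pages.pdf in1.pdf:2-2 in1.pdf:10-15 in1.pdf:1-1 in2.pdf:4-6`. The same caveats about fonts apply.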

answered Sep 24 '22 17:09 by Kurt Pfeifle