
Merge PDF's with PDFTK with Bookmarks?

Using pdftk to merge multiple PDFs is working well. However, is there an easy way to create a bookmark for each PDF that is merged?

I don't see anything on the pdftk docs regarding this so I don't think it's possible with pdftk.

Each of the files we merge will be 1 page, so I'm wondering whether there's any other utility that can add bookmarks afterwards.

Or is there another Linux-based PDF utility that allows merging while specifying a bookmark for each individual PDF?

Jason asked Jun 03 '10 20:06



1 Answer

You can also merge multiple PDFs with Ghostscript. The big advantage of this route is that a solution is easily scriptable, and it does not require a real programming effort:

    gswin32c.exe ^
        -dBATCH -dNOPAUSE ^
        -sDEVICE=pdfwrite ^
        -sOutputFile=merged.pdf ^
        [...more Ghostscript options as needed...] ^
        input1.pdf input2.pdf input3.pdf [....]

With Ghostscript you'll be able to pass pdfmark statements which can add a Table of Contents as well as bookmarks for each additional source file going into the resulting PDF. For example:

    gswin32c.exe ^
        -dBATCH -dNOPAUSE ^
        -sDEVICE=pdfwrite ^
        -sOutputFile=merged.pdf ^
        [...more Ghostscript options as needed...] ^
        file-with-pdfmarks-to-generate-a-ToC.ps ^
        -f input1.pdf input2.pdf input3.pdf [....]

or

    gswin32c.exe ^
        -dBATCH -dNOPAUSE ^
        -sDEVICE=pdfwrite ^
        -sOutputFile=merged.pdf ^
        [...more Ghostscript options as needed...] ^
        file-with-pdfmarks-to-generate-a-ToC.ps ^
        -f input1.pdf ^
           input2.pdf ^
           input3.pdf [....]

For some introduction to the pdfmark topic, see also Thomas Merz's PDFmark Primer.


Edit:
I had wanted to give you an example for file-with-pdfmarks-to-generate-a-ToC.ps, but somehow forgot it. Here it is:

    [/Page 1 /View [/XYZ null null null] /Title (File 1) /OUT pdfmark
    [/Page 2 /View [/XYZ null null null] /Title (File 2) /OUT pdfmark
    [/Page 3 /View [/XYZ null null null] /Title (File 3) /OUT pdfmark
    [/Page 4 /View [/XYZ null null null] /Title (File 4) /OUT pdfmark

This would create a ToC for the first 4 files == first 4 pages (since you guarantee your ingredient files are 1 page each for your merged output PDF).

  1. The [/XYZ null null null] part makes sure your page viewport and zoom level do not change from the current ones when you follow the link. (You could instead say [/XYZ 222 111 2] to jump to left=222, top=111 at a zoom factor of 2, if you want an arbitrary example.)
  2. The /Title (some string you want) thingie determines what text is in the ToC.
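For more than a handful of files, typing these lines by hand gets tedious. The following is a minimal Python sketch, not part of the original answer, that generates one bookmark line per file under the same assumption as above (every input is exactly 1 page). The function names `ps_escape` and `build_pdfmarks` and the input file names are made up for illustration.

```python
from __future__ import annotations


def ps_escape(text: str) -> str:
    """Escape backslashes and parentheses for a PostScript string literal."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def build_pdfmarks(titles: list[str]) -> str:
    """Return one /OUT pdfmark line per title.

    Assumption: every source PDF has exactly one page, so the n-th title
    points at page n of the merged output. Titles are assumed to be ASCII.
    """
    lines = []
    for page, title in enumerate(titles, start=1):
        lines.append(
            f"[/Page {page} /View [/XYZ null null null] "
            f"/Title ({ps_escape(title)}) /OUT pdfmark"
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical input names; the bookmark title is the name without ".pdf".
    inputs = ["input1.pdf", "input2.pdf", "input3.pdf"]
    titles = [name.rsplit(".", 1)[0] for name in inputs]
    print(build_pdfmarks(titles), end="")
```

Redirecting the output into file-with-pdfmarks-to-generate-a-ToC.ps should give a file usable in the commands above. The escaping matters because an unbalanced parenthesis in a title would otherwise break the PostScript string.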

And, you could even add these parameters to the Ghostscript commandline directly:

    gswin32c.exe ^
        -sDEVICE=pdfwrite ^
        -o merged.pdf ^
        [...more Ghostscript options as needed...] ^
        -c "[/Page 1 /View [/XYZ null null null] /Title (File 1) /OUT pdfmark" ^
        -c "[/Page 2 /View [/XYZ null null null] /Title (File 2) /OUT pdfmark" ^
        -c "[/Page 3 /View [/XYZ null null null] /Title (File 3) /OUT pdfmark" ^
        -c "[/Page 4 /View [/XYZ null null null] /Title (File 4) /OUT pdfmark" ^
        -f input1.pdf ^
           input2.pdf ^
           input3.pdf ^
           input4.pdf [....]
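If you script this, the same command line can be assembled as an argument list, which sidesteps shell quoting of the brackets and parentheses. This is only a sketch: `build_gs_command` is a hypothetical helper, the file names are placeholders, and the `gs` default assumes the Linux executable name (use `gswin32c` on Windows).

```python
from __future__ import annotations

import shlex


def build_gs_command(
    inputs: list[str],
    pdfmarks: list[str],
    output: str = "merged.pdf",
    gs: str = "gs",
) -> list[str]:
    """Build a Ghostscript argument list that merges inputs into output.

    Each pdfmark statement is passed with its own -c switch; -f ends the
    PostScript snippets so the names that follow are read as input files.
    """
    cmd = [gs, "-sDEVICE=pdfwrite", "-o", output]
    for mark in pdfmarks:
        cmd += ["-c", mark]
    cmd.append("-f")
    cmd += inputs
    return cmd


if __name__ == "__main__":
    # Placeholder file names and titles, one single-page PDF each.
    inputs = ["input1.pdf", "input2.pdf"]
    marks = [
        f"[/Page {n} /View [/XYZ null null null] /Title (File {n}) /OUT pdfmark"
        for n in range(1, len(inputs) + 1)
    ]
    # Only print the command; nothing is executed here.
    print(shlex.join(build_gs_command(inputs, marks)))
```

To actually run it, the returned list can be handed to `subprocess.run(cmd, check=True)`. Passing a list rather than a single string means no shell is involved, so the pdfmark text needs no extra quoting.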



'nother Edit:

Oh, and by the way: Ghostscript does preserve the bookmarks when you use it to merge two PDF files into one -- pdftk.exe does not. Let's use the one generated by the command of my first edit (effectively concatenating 2 copies of the same file):

    gswin32c ^
        -sDEVICE=pdfwrite ^
        -o doublemerged.pdf ^
        merged.pdf ^
        merged.pdf

The file doublemerged.pdf will now have 2*4 = 8 bookmarks.

  • As expected, bookmarks 1, 2, 3 and 4 link to pages 1, 2, 3 and 4.
  • The problem is that bookmarks 5, 6, 7 and 8 also link to pages 1, 2, 3 and 4.

The reason is that the pre-existing bookmarks address their link targets by absolute page numbers. To work around that (and make bookmarks work in merged files), one would have to generate bookmarks which point to their link targets by named destinations (and make sure these names are unique across the documents being merged).
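As a rough illustration of the named-destination idea, the sketch below emits two pdfmark statements per page: a /DEST pdfmark that defines a named destination, and an /OUT pdfmark whose bookmark refers to that name instead of a page number. The helper names and the `<doc_id>.p<page>` naming scheme are my own assumptions for making names unique per document. Whether a given Ghostscript version keeps these names intact through a later merge is something you would need to verify.

```python
from __future__ import annotations

import re


def ps_name(text: str) -> str:
    """Reduce text to characters that are safe inside a PostScript /name."""
    return re.sub(r"[^A-Za-z0-9_.-]", "_", text)


def ps_escape(text: str) -> str:
    """Escape backslashes and parentheses for a PostScript string literal."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def named_dest_pdfmarks(doc_id: str, titles: list[str]) -> str:
    """Return /DEST + /OUT pdfmark pairs, one pair per single-page file.

    doc_id is a hypothetical per-document prefix; it keeps destination
    names distinct when several bookmarked PDFs are merged later.
    """
    lines = []
    for page, title in enumerate(titles, start=1):
        dest = f"{ps_name(doc_id)}.p{page}"
        # Define the named destination on the target page.
        lines.append(
            f"[/Dest /{dest} /Page {page} /View [/XYZ null null null] /DEST pdfmark"
        )
        # Create the bookmark that points at the name, not at a page number.
        lines.append(f"[/Dest /{dest} /Title ({ps_escape(title)}) /OUT pdfmark")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(named_dest_pdfmarks("setA", ["File 1", "File 2"]), end="")
```

Using a different `doc_id` for each bookmarked file is what keeps the destination names from colliding.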

(This approach also works on Linux; just use gs instead of gswin32c.)


Appendix

The command lines above use [...more Ghostscript options as needed...] as a placeholder for more options.

If you do not use other options, Ghostscript will apply its built-in defaults for various parameters. However, this may give you results which are not to your liking. Since Ghostscript generates a completely new PDF based on the input, some of the original objects may be changed. This is true for color spaces and for image compression levels.

How to apply parameters which leave the originally embedded images unchanged can be seen over at SuperUser: "Use Ghostscript, but tell it to not reprocess images".

Kurt Pfeifle answered Oct 05 '22 18:10