
Combining PDF files with Ghostscript: how to include the original file names?

I have about 250 single-page pdf files that have names like:

file_1_100.pdf,
file_1_200.pdf, 
file_1_300.pdf, 
file_2_100.pdf, 
file_2_200.pdf, 
file_2_300.pdf, 
file_3_100.pdf, 
file_3_200.pdf, 
file_3_300.pdf
...etc

I am using the following command to combine them to a single pdf file:

gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=finished.pdf file*pdf

It works perfectly, combining them in the correct order. However, when I look at finished.pdf, I want a reference that tells me the original filename for each page.

Does anyone have any suggestions? Can I add page names referencing the files or something?

Stephen asked Aug 18 '11 03:08



2 Answers

It is fairly easy to put the file names into a list of Bookmarks which many PDF viewers can display.

This is done with PostScript using the 'pdfmark' distiller operator. For example, run:

gs -sDEVICE=pdfwrite -o finished.pdf control.ps

where control.ps contains PS commands to print the pages and output the bookmark (/OUT) pdfmarks:

(examples/tiger.eps) run [ /Page 1 /Title (tiger.eps) /OUT pdfmark
(examples/colorcir.ps) run [ /Page 2 /Title (colorcir.ps) /OUT pdfmark
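
For many files, the control.ps lines above can be generated rather than typed by hand. The following is only a sketch under some assumptions: every input is a single-page PDF (as in the question), file names contain no parentheses or backslashes, and the function name make_control_ps is made up for illustration.

```shell
# make_control_ps: print one "run + bookmark" line per file name argument.
# Assumes one page per input file, so the page number is simply a counter.
make_control_ps() {
  page=1
  for f in "$@"; do
    printf '(%s) run [ /Page %d /Title (%s) /OUT pdfmark\n' "$f" "$page" "$f"
    page=$((page + 1))
  done
}

# Usage sketch (the shell expands the glob in its own sorted order):
#   make_control_ps file_*.pdf > control.ps
#   gs -sDEVICE=pdfwrite -o finished.pdf control.ps
```

Recent Ghostscript releases run with -dSAFER by default and may refuse to open files that are named only inside control.ps. If that happens, an option such as --permit-file-read (or -dNOSAFER) is likely needed.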

Note that you can also perform the enumeration using PS to automate the entire process:

/PN 1 def
(file*.pdf) {
  /FN exch def
  FN run
  [ /Page PN /Title FN /OUT pdfmark % do the file and bookmark it by filename
  /PN PN 1 add def % bump the page number
} 1000 string filenameforall

Note that the order of filenameforall enumeration is not specified, so you may want to sort the list to control the order, using the Ghostscript extension .sort (usage: array {lt} .sort, which leaves the sorted array on the stack).

After thinking about this further, I realized that if an input file has more than one page, there is a better way to set the bookmark to the correct page number: use the 'PageCount' device property.

[
  (file*.pdf) { dup length string copy } 1000 string filenameforall
] % create array of filenames
{ lt } .sort % sort in increasing alphabetic order
/PN 1 def
{ /FN exch def
  /PN currentpagedevice /PageCount get 1 add def % get current page count done (next is one greater)
  FN run [ /Page PN /Title FN /OUT pdfmark % do the file and bookmark it by filename
} forall

The above creates an array of strings (copying them to unique string objects since filenameforall just overwrites the string it is given), then sorts it, and finally processes the array of strings using the forall operator. By using the PageCount device property to get the count of pages already produced, the page number (PN) for the bookmark will be correct. I have tested this snippet as 'control.ps'.
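
If the numbers in the file names are not zero-padded, alphabetic order puts file_10_100.pdf before file_2_100.pdf. One possible workaround, sketched here under assumptions, is to build control.ps from the shell with a numeric sort and keep the PageCount technique for the bookmarks. The function name make_control_ps_sorted is hypothetical, and the sort keys assume names of the form file_A_B.pdf without parentheses or backslashes.

```shell
# make_control_ps_sorted: read file names on stdin (one per line), sort them
# numerically on the two number fields of file_A_B.pdf, and print PostScript
# that bookmarks each file at the page where it starts.
make_control_ps_sorted() {
  sort -t_ -k2,2n -k3,3n | while IFS= read -r f; do
    # Pages written so far plus one is the first page of the next file.
    printf '/PN currentpagedevice /PageCount get 1 add def\n'
    printf '(%s) run [ /Page PN /Title (%s) /OUT pdfmark\n' "$f" "$f"
  done
}

# Usage sketch:
#   printf '%s\n' file_*.pdf | make_control_ps_sorted > control.ps
#   gs -sDEVICE=pdfwrite -o finished.pdf control.ps
```

The generated PostScript lines are the same ones used in the snippet above; only the enumeration and sorting move from PostScript to the shell.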

Ray Johnston answered Oct 01 '22 11:10


To stamp the filename on each page you can use a combination of ghostscript and pdftk. Taken from https://superuser.com/questions/171790/print-pdf-file-with-file-path-in-footer

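# Create a one-page PDF that contains only the footer text
gs \
-o outdir/footer.pdf \
-sDEVICE=pdfwrite \
-c "5 5 moveto /Helvetica findfont 9 scalefont setfont (foobar-filename.pdf) show showpage"

# Overlay the footer page on every page of the input file
pdftk \
foobar-filename.pdf \
stamp outdir/footer.pdf \
output outdir/merged_foobar-filename.pdf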
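
To apply this to all 250 files, the footer text has to be built once per file name. The sketch below is an illustration, not a tested pipeline: ps_footer and the stamped/ directory are hypothetical names, and the helper escapes backslashes and parentheses so that unusual file names still form a valid PostScript string.

```shell
# ps_footer: print a PostScript program that draws its argument as a 9pt footer.
ps_footer() {
  # Escape \ ( and ) so the name is safe inside a PostScript (...) string.
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/(/\\(/g' -e 's/)/\\)/g')
  printf '5 5 moveto /Helvetica findfont 9 scalefont setfont (%s) show showpage\n' "$escaped"
}

# Usage sketch (needs gs and pdftk; stamped/ is a hypothetical output directory):
#   mkdir -p stamped
#   for f in file_*.pdf; do
#     gs -q -o stamped/footer.pdf -sDEVICE=pdfwrite -c "$(ps_footer "$f")"
#     pdftk "$f" stamp stamped/footer.pdf output "stamped/$f"
#   done
#   gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=finished.pdf stamped/file_*.pdf
```

The final gs command is the merge command from the question, pointed at the stamped copies instead of the originals.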

matt wilkie answered Oct 01 '22 11:10