
Is it possible to combine a series of PDFs into one using Ruby?

Tags: ruby, pdf

I have a series of PDFs named sequentially like so:

  • 01_foo.pdf
  • 02_bar.pdf
  • 03_baz.pdf
  • etc.

Using Ruby, is it possible to combine these into one big PDF while keeping them in sequence? I don't mind installing any necessary gems to do the job.

If this isn't possible in Ruby, how about another language? No commercial components, if possible.


Update: Jason Navarrete's suggestion led to the perfect solution:

Place the PDF files needing to be combined in a directory along with pdftk (or make sure pdftk is in your PATH), then run the following script:

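# Collect the numbered files in sequence (file names must not contain spaces).
pdfs = Dir["[0-9][0-9]_*"].sort.join(" ")
# Shell out to pdftk to concatenate them into combined.pdf.
`pdftk #{pdfs} output combined.pdf`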

Or I could even do it as a one-liner from the command line:

ruby -e '`pdftk #{Dir["[0-9][0-9]_*"].sort.join(" ")} output combined.pdf`'

Great suggestion, Jason. Perfect solution, thanks. Give him an up-vote, people.

Charles Roper asked Sep 17 '08 17:09


3 Answers

A Ruby-Talk post suggests using the pdftk toolkit to merge the PDFs.

It should be relatively straightforward to call pdftk as an external process and have it handle the merging. PDF::Writer may be overkill because all you're looking to accomplish is a simple append.
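
As a rough illustration of calling pdftk as an external process, here is a minimal Ruby sketch. The helper names `pdftk_command` and `merge_with_pdftk` are made up for this example, and it assumes pdftk is on your PATH. It spells out pdftk's `cat` operation explicitly and passes each file name as a separate argument, so names containing spaces should not need quoting.

```ruby
# Sketch only: pdftk_command and merge_with_pdftk are illustrative names,
# not part of any library. Assumes pdftk is installed and on PATH.

# Build the argument list: pdftk <inputs in sorted order> cat output <file>
def pdftk_command(inputs, output)
  ["pdftk", *inputs.sort, "cat", "output", output]
end

# Merge every file matching NN_*.pdf in dir into a single PDF.
def merge_with_pdftk(dir = ".", output = "combined.pdf")
  inputs = Dir.glob(File.join(dir, "[0-9][0-9]_*.pdf"))
  raise ArgumentError, "no matching PDFs in #{dir}" if inputs.empty?

  # With several separate arguments, system runs the program directly
  # instead of handing a command string to the shell.
  system(*pdftk_command(inputs, output))
end
```

Calling `merge_with_pdftk` from the directory holding the PDFs returns true if pdftk exits successfully, false if it fails, and nil if pdftk could not be started.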

Jason Navarrete answered Sep 19 '22 22:09


You can do this by converting to PostScript and back. PostScript files can be concatenated trivially. For example, here's a Bash script that uses the Ghostscript tools ps2pdf and pdf2ps:

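#!/bin/bash
# Start from a clean temporary file, since the loop below appends to it.
rm -f temp.ps
for file in 01_foo.pdf 02_bar.pdf 03_baz.pdf; do
    # Quote the name so files containing spaces are handled correctly.
    pdf2ps "$file" - >> temp.ps
done

# Convert the concatenated PostScript back to a single PDF, then clean up.
ps2pdf temp.ps output.pdf
rm temp.ps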

I'm not familiar with Ruby, but there's almost certainly some function (possibly called system(), though that's just a guess) that will invoke a given command line.
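
For what it's worth, Ruby does have `Kernel#system`, so the guess holds. Below is a hedged Ruby sketch of the same PostScript round trip. The names `run!` and `merge_via_postscript` are invented for this example, and it assumes the Ghostscript wrappers pdf2ps and ps2pdf are on your PATH.

```ruby
# Sketch only: run! and merge_via_postscript are illustrative names.
# Assumes pdf2ps and ps2pdf (Ghostscript wrappers) are on PATH.

# Run a command and raise if it cannot be started or exits non-zero.
def run!(*cmd, **opts)
  ok = system(*cmd, **opts)
  raise "command failed: #{cmd.join(' ')}" unless ok
  true
end

# Convert each PDF to PostScript, append it to one temporary file,
# then convert the concatenated PostScript back into a single PDF.
def merge_via_postscript(files, output, temp = "temp.ps")
  File.delete(temp) if File.exist?(temp)
  files.each do |file|
    # out: [temp, "a"] appends the command's stdout to temp, like `>>` in Bash.
    run!("pdf2ps", file, "-", out: [temp, "a"])
  end
  run!("ps2pdf", temp, output)
ensure
  # Remove the temporary file whether or not the conversion succeeded.
  File.delete(temp) if File.exist?(temp)
end
```

A call such as `merge_via_postscript(Dir["[0-9][0-9]_*.pdf"].sort, "output.pdf")` would mirror the loop in the script, with the file list taken from the directory instead of being hard-coded.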

Adam Rosenfield answered Sep 23 '22 22:09


If you have Ghostscript on your platform, shell out and execute this command:

gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=finished.pdf <your source pdf files>
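
To tie this back to the Ruby question, shelling out with that command might look like the sketch below. The function name `ghostscript_merge_command` is made up for illustration, the flags are copied from the command above, and gs is assumed to be on your PATH.

```ruby
# Sketch only: ghostscript_merge_command is an illustrative name.
# The flags are taken verbatim from the gs command above.
def ghostscript_merge_command(inputs, output = "finished.pdf")
  ["gs", "-dBATCH", "-dNOPAUSE", "-q", "-sDEVICE=pdfwrite",
   "-sOutputFile=#{output}", *inputs.sort]
end

# Example (requires Ghostscript on PATH):
#   system(*ghostscript_merge_command(Dir["[0-9][0-9]_*.pdf"]))
```

Sorting the inputs keeps the numbered files in sequence, which is the ordering the question asks for.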

Steve Hanov answered Sep 23 '22 22:09