I have 3 PDF documents that are generated on the fly by a legacy library that we use, and written to disk. What's the easiest way for my JAVA server code to grab these 3 documents and turn them into one long PDF document where it's just all the pages from document #1, followed by all the pages from document #2, etc.
Ideally I would like this to happen in memory so I can return it as a stream to the client, but writing it to disk is also an option.
@J D OConal, thanks for the tip, the article you sent me was very outdated, but it did point me towards iText. I found this page that explains how to do exactly what I need: http://java-x.blogspot.com/2006/11/merge-pdf-files-with-itext.html
Thanks for the other answers, but I don't really want to have to spawn other processes if I can avoid it, and our project already has itext.jar, so I'm not adding any external dependancies
Here's the code I ended up writing:
public class PdfMergeHelper {
/**
* Merges the passed in PDFs, in the order that they are listed in the java.util.List.
* Writes the resulting PDF out to the OutputStream provided.
*
* Sample Usage:
* List<InputStream> pdfs = new ArrayList<InputStream>();
* pdfs.add(new FileInputStream("/location/of/pdf/OQS_FRSv1.5.pdf"));
* pdfs.add(new FileInputStream("/location/of/pdf/PPFP-Contract_Genericv0.5.pdf"));
* pdfs.add(new FileInputStream("/location/of/pdf/PPFP-Quotev0.6.pdf"));
* FileOutputStream output = new FileOutputStream("/location/to/write/to/merge.pdf");
* PdfMergeHelper.concatPDFs(pdfs, output, true);
*
* @param streamOfPDFFiles the list of files to merge, in the order that they should be merged
* @param outputStream the output stream to write the merged PDF to
* @param paginate true if you want page numbers to appear at the bottom of each page, false otherwise
*/
public static void concatPDFs(List<InputStream> streamOfPDFFiles, OutputStream outputStream, boolean paginate) {
Document document = new Document();
try {
List<InputStream> pdfs = streamOfPDFFiles;
List<PdfReader> readers = new ArrayList<PdfReader>();
int totalPages = 0;
Iterator<InputStream> iteratorPDFs = pdfs.iterator();
// Create Readers for the pdfs.
while (iteratorPDFs.hasNext()) {
InputStream pdf = iteratorPDFs.next();
PdfReader pdfReader = new PdfReader(pdf);
readers.add(pdfReader);
totalPages += pdfReader.getNumberOfPages();
}
// Create a writer for the outputstream
PdfWriter writer = PdfWriter.getInstance(document, outputStream);
document.open();
BaseFont bf = BaseFont.createFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
PdfContentByte cb = writer.getDirectContent(); // Holds the PDF
// data
PdfImportedPage page;
int currentPageNumber = 0;
int pageOfCurrentReaderPDF = 0;
Iterator<PdfReader> iteratorPDFReader = readers.iterator();
// Loop through the PDF files and add to the output.
while (iteratorPDFReader.hasNext()) {
PdfReader pdfReader = iteratorPDFReader.next();
// Create a new page in the target for each source page.
while (pageOfCurrentReaderPDF < pdfReader.getNumberOfPages()) {
document.newPage();
pageOfCurrentReaderPDF++;
currentPageNumber++;
page = writer.getImportedPage(pdfReader, pageOfCurrentReaderPDF);
cb.addTemplate(page, 0, 0);
// Code for pagination.
if (paginate) {
cb.beginText();
cb.setFontAndSize(bf, 9);
cb.showTextAligned(PdfContentByte.ALIGN_CENTER, "" + currentPageNumber + " of " + totalPages,
520, 5, 0);
cb.endText();
}
}
pageOfCurrentReaderPDF = 0;
}
outputStream.flush();
document.close();
outputStream.close();
} catch (Exception e) {
e.printStackTrace();
} finally {
if (document.isOpen()) {
document.close();
}
try {
if (outputStream != null) {
outputStream.close();
}
} catch (IOException ioe) {
ioe.printStackTrace();
}
}
}
}
I've used pdftk to great effect. It's an external application that you'll have to run from your java app.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With