
Merge several PDFs together and create a new PDF/A with Apache PDFBox

Tags:

pdf

pdfbox

I'm using Apache PDFBox to preset several non-PDF/A forms, and then the PDFMergerUtility to merge these PDFs and create a byte array of the new PDF.


import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.multipdf.PDFMergerUtility;

PDFMergerUtility mergerUtility = new PDFMergerUtility();

// presetting forms of these PDFs is omitted for readability
mergerUtility.addSource(new File("a.pdf"));
mergerUtility.addSource(new File("b.pdf"));
mergerUtility.addSource(new File("c.pdf"));

// collect the merged document in memory instead of writing it to a file
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
mergerUtility.setDestinationStream(outputStream);

try {
  mergerUtility.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());
  return outputStream.toByteArray();
} catch (IOException ex) {
  log.error("Unable to merge documents", ex);
  throw new RuntimeException("Unable to merge", ex);
}

Is there a way to tell the PDFMergerUtility to create a valid PDF/A document that cannot be modified anymore?

asked Jan 20 '26 02:01 by saw303


1 Answer

Converting an existing PDF "from the wild" into a compliant PDF/A is a very complex task unless you created the PDF yourself. I would rather suggest using a product from Callas, PDF-Tools, or another company.

(PDFBox has a command-line tool, preflight, that checks whether your PDF is PDF/A-1b compliant. Running it can give you a taste of all the shortcomings in "ordinary" PDFs.)
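
The same preflight check can also be run from Java. The following is only a minimal sketch, assuming PDFBox 2.x with the separate preflight module on the classpath (the 3.x API differs). The class name PdfAPreflightCheck and the helpers validate and describe are hypothetical names chosen for illustration. It only reports whether a file passes PDF/A-1b validation; it does not convert anything.

```java
import java.io.File;

import org.apache.pdfbox.preflight.PreflightDocument;
import org.apache.pdfbox.preflight.ValidationResult;
import org.apache.pdfbox.preflight.ValidationResult.ValidationError;
import org.apache.pdfbox.preflight.exception.SyntaxValidationException;
import org.apache.pdfbox.preflight.parser.PreflightParser;

// Hypothetical helper class; assumes PDFBox 2.x plus the preflight module.
class PdfAPreflightCheck {

    // Parses the file and runs the PDF/A-1b validation.
    static ValidationResult validate(File pdf) throws Exception {
        PreflightParser parser = new PreflightParser(pdf);
        try {
            parser.parse();
            try (PreflightDocument document = parser.getPreflightDocument()) {
                document.validate();
                return document.getResult();
            }
        } catch (SyntaxValidationException e) {
            // The file could not even be parsed as a PDF/A candidate;
            // the exception carries the validation result.
            return e.getResult();
        }
    }

    // Turns a validation result into a short human-readable report.
    static String describe(ValidationResult result) {
        if (result.isValid()) {
            return "valid PDF/A-1b";
        }
        StringBuilder report = new StringBuilder("not valid PDF/A-1b:");
        for (ValidationError error : result.getErrorsList()) {
            report.append(System.lineSeparator())
                  .append(error.getErrorCode())
                  .append(" : ")
                  .append(error.getDetails());
        }
        return report.toString();
    }

    public static void main(String[] args) throws Exception {
        // Usage: java PdfAPreflightCheck merged.pdf
        System.out.println(describe(validate(new File(args[0]))));
    }
}
```

Running this against the merged output (written to a file first) should list each reason the document fails validation, which is a quick way to see how far an "ordinary" PDF is from PDF/A-1b.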

answered Jan 23 '26 21:01 by Tilman Hausherr