Using PDFBox, how do I retrieve the contents of a PDDocument as a byte array?

I am currently using PDFBox as the driver for a PDF editor application. I need the contents of PDFBox's in-memory representation of a PDF file (a PDDocument) as a byte array. Does anyone know how to do this?

Gabriel Ruiu asked Jul 21 '12 14:07


People also ask

What is PDFBox used for?

Apache PDFBox is an open source Java library that can be used to create, render, print, split, merge, alter, verify and extract text and meta-data of PDF files.

Is PDFBox thread safe?

No. Only one thread may access a single document at a time. You can have multiple threads each accessing their own PDDocument object.


1 Answer

I hope it's not too late...

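    // Serialize the document into an in-memory buffer
    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
    document.save(byteArrayOutputStream);
    document.close();
    // The PDF contents as a byte array
    byte[] pdfBytes = byteArrayOutputStream.toByteArray();
    // Optional: wrap the bytes if you also need an InputStream
    InputStream inputStream = new ByteArrayInputStream(pdfBytes);

And voilà! You've got the document's contents as a byte array, plus an input stream over the same bytes if you need one.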
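If you need this in more than one place, the same save-to-buffer idea can be wrapped in a small helper. The following is only a sketch: the class name PdfBytes and the StreamWriter interface are hypothetical names made up for illustration, not part of PDFBox. The helper accepts anything that can write itself to an OutputStream, so it compiles with the JDK alone.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical helper class; not part of PDFBox.
final class PdfBytes {

    // Hypothetical stand-in for "something that can write itself to a stream",
    // for example a method reference to PDDocument's save(OutputStream).
    @FunctionalInterface
    interface StreamWriter {
        void writeTo(OutputStream out) throws IOException;
    }

    // Lets the writer fill an in-memory buffer, then returns a copy of the bytes.
    static byte[] toByteArray(StreamWriter writer) throws IOException {
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream()) {
            writer.writeTo(buffer);
            return buffer.toByteArray();
        }
    }

    private PdfBytes() {
        // Static helper only; no instances.
    }
}
```

Assuming a PDFBox version in which save(OutputStream) declares only IOException (this should hold for 2.x and later, but check your version), the call would look like byte[] pdfBytes = PdfBytes.toByteArray(document::save); followed by document.close(). Note that the whole file is held in memory, so very large documents may be better saved to a temporary file instead.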

JJS answered Oct 23 '22 19:10