Convert pdf to byte[] and vice versa with pdfbox

I've read the documentation and the examples, but I'm having a hard time putting it all together. I'm just trying to take a test PDF file, convert it to a byte array, convert the byte array back into a PDF, and then write that PDF to disk.

It probably doesn't help much, but this is what I've got so far:

package javaapplication1;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.pdfbox.cos.COSStream;
import org.apache.pdfbox.exceptions.COSVisitorException;
import org.apache.pdfbox.pdmodel.PDDocument;

public class JavaApplication1 {

    private COSStream stream;

    public static void main(String[] args) {
        try {
            PDDocument in = PDDocument.load("C:\\Users\\Me\\Desktop\\JavaApplication1\\in\\Test.pdf");
            byte[] pdfbytes = toByteArray(in);
            PDDocument out;
        } catch (Exception e) {
            System.out.println(e);
        }
    }

    private static byte[] toByteArray(PDDocument pdDoc) throws IOException, COSVisitorException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            pdDoc.save(out);
            pdDoc.close();
        } catch (Exception ex) {
            System.out.println(ex);
        }
        return out.toByteArray();
    }

    public void PDStream(PDDocument document) {
        stream = new COSStream(document.getDocument().getScratchFile());
    }
}
ThreaT asked Jul 17 '13 19:07

1 Answer

You can use Apache Commons IO, which is essential in any Java project IMO.

Then you can use FileUtils's readFileToByteArray(File file) and writeByteArrayToFile(File file, byte[] data).

(FileUtils is part of commons-io, which you can download here: http://commons.apache.org/proper/commons-io/download_io.cgi )

For example, I just tried this here and it worked beautifully.

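import java.io.File;
import java.io.IOException;
import org.apache.commons.io.FileUtils;

try {
    // Read the whole PDF into memory as raw bytes.
    File file = new File("/example/path/contract.pdf");
    byte[] array = FileUtils.readFileToByteArray(file);
    // Write the same bytes back out as a new PDF file.
    FileUtils.writeByteArrayToFile(new File("/example/path/contract2.pdf"), array);
} catch (IOException e) {
    e.printStackTrace();
}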
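
If adding a dependency is not an option, the same round trip can probably be done with the JDK alone (Java 7 or later) using java.nio.file.Files. The following is only a sketch: the class name PdfBytesRoundTrip and the helper names toByteArray and toFile are made up for illustration, and the paths are the same placeholder paths as above. Like the FileUtils version, it copies the bytes as they are and never parses the PDF.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class PdfBytesRoundTrip {

    // Reads the whole file into memory; fine for small PDFs, costly for very large ones.
    static byte[] toByteArray(Path source) throws IOException {
        return Files.readAllBytes(source);
    }

    // Writes the bytes unchanged, creating the target file or overwriting it if it exists.
    static void toFile(byte[] data, Path target) throws IOException {
        Files.write(target, data);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder paths (assumption): replace with real locations.
        Path in = Paths.get("/example/path/contract.pdf");
        Path out = Paths.get("/example/path/contract2.pdf");

        byte[] pdfBytes = toByteArray(in);
        toFile(pdfBytes, out);
        System.out.println("Copied " + pdfBytes.length + " bytes");
    }
}
```

Since nothing here inspects the content, this works for any file type. PDFBox would only be needed if the document has to be modified between reading and writing.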
Doodad answered Sep 25 '22 15:09