Using pdfbox, is it possible to convert a PDF (or a PDF byte[]) into an image byte[]? I've looked through several examples online and the only ones I can find describe how either to directly write the converted file to the filesystem or to convert it to a Java AWT object.
I'd rather not incur the IO of writing an image file to the filesystem, reading it back into a byte[], and then deleting it.
So this I can do:
String destinationImageFormat = "jpg";
boolean success = false;
InputStream is = getClass().getClassLoader().getResourceAsStream("example.pdf");
PDDocument pdf = PDDocument.load( is, true );
int resolution = 256;
String password = "";
String outputPrefix = "myImageFile";
PDFImageWriter imageWriter = new PDFImageWriter();
success = imageWriter.writeImage(pdf,
        destinationImageFormat,
        password,
        1,
        2,
        outputPrefix,
        BufferedImage.TYPE_INT_RGB,
        resolution);
As well as this:
InputStream is = getClass().getClassLoader().getResourceAsStream("example.pdf");
PDDocument pdf = PDDocument.load( is, true );
List<PDPage> pages = pdf.getDocumentCatalog().getAllPages();
for ( PDPage page : pages )
{
    BufferedImage image = page.convertToImage();
}
What I'm not clear on is how to transform the BufferedImage into a byte[]. I know the image is written to a file output stream inside imageWriter.writeImage(), but I'm not clear on how the API works.
Add the Maven dependency:
<!-- https://mvnrepository.com/artifact/org.apache.pdfbox/pdfbox -->
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
<version>2.0.1</version>
</dependency>
Then convert the PDF to images, one PNG file per page:
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;
import javax.imageio.ImageIO;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

private List<String> savePDF(String filePath) throws IOException {
    List<String> result = new ArrayList<>();
    File file = new File(filePath);
    // try-with-resources closes the document even if rendering fails
    try (PDDocument doc = PDDocument.load(file)) {
        PDFRenderer renderer = new PDFRenderer(doc);
        int pageCount = doc.getNumberOfPages();
        for (int i = 0; i < pageCount; i++) {
            // Page indexes are 0-based; file names are numbered from 1
            String pngFileName = file.getPath() + "." + (i + 1) + ".png";
            try (FileOutputStream out = new FileOutputStream(pngFileName)) {
                ImageIO.write(renderer.renderImageWithDPI(i, 96), "png", out);
            }
            result.add(pngFileName);
        }
    }
    return result;
}
EDIT:
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;
import javax.imageio.ImageIO;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Returns one PNG-encoded byte[] per page instead of writing files
private List<byte[]> renderPDF(String filePath) throws IOException {
    List<byte[]> result = new ArrayList<>();
    try (PDDocument doc = PDDocument.load(new File(filePath))) {
        PDFRenderer renderer = new PDFRenderer(doc);
        int pageCount = doc.getNumberOfPages();
        for (int i = 0; i < pageCount; i++) {
            // ByteArrayOutputStream has no String constructor; use the no-arg one
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            ImageIO.write(renderer.renderImageWithDPI(i, 96), "png", out);
            result.add(out.toByteArray()); // here you get the byte array for page i + 1
        }
    }
    return result;
}
You can use ImageIO.write to write to an OutputStream. To get a byte[], use a ByteArrayOutputStream, then call toByteArray() on it.
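As a minimal sketch of that suggestion, the helper below encodes any BufferedImage to a byte[] using only the JDK. The class name ImageBytes and the method name toBytes are made up for illustration and are not part of PDFBox or ImageIO. It assumes you already have the BufferedImage, for example from page.convertToImage() or renderer.renderImageWithDPI(...).

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper class; the name is illustrative only.
final class ImageBytes {
    private ImageBytes() {}

    // Encodes the image in the given ImageIO format name (e.g. "png")
    // into an in-memory buffer and returns the encoded bytes.
    static byte[] toBytes(BufferedImage image, String format) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // ImageIO.write returns false when no suitable writer is found
        if (!ImageIO.write(image, format, out)) {
            throw new IOException("No ImageIO writer for format: " + format);
        }
        return out.toByteArray();
    }
}
```

Usage would look like byte[] png = ImageBytes.toBytes(image, "png"). Since the goal is to avoid disk IO, it may also be worth calling ImageIO.setUseCache(false) once at startup, because ImageIO can otherwise use a temporary cache file when it wraps the OutputStream.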