Convert PDF files to images with PDFBox

Tags:

pdfbox

Can someone give me an example of how to use Apache PDFBox to convert a PDF file into separate images (one for each page of the PDF)?

user3423568 asked Apr 27 '14 17:04


2 Answers

Solution for 1.8.* versions:

    PDDocument document = PDDocument.loadNonSeq(new File(pdfFilename), null);
    List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
    int page = 0;
    for (PDPage pdPage : pdPages)
    {
        ++page;
        BufferedImage bim = pdPage.convertToImage(BufferedImage.TYPE_INT_RGB, 300);
        ImageIOUtil.writeImage(bim, pdfFilename + "-" + page + ".png", 300);
    }
    document.close();

Don't forget to read the 1.8 dependencies page before doing your build.

Solution for the 2.0 version:

    PDDocument document = PDDocument.load(new File(pdfFilename));
    PDFRenderer pdfRenderer = new PDFRenderer(document);
    for (int page = 0; page < document.getNumberOfPages(); ++page)
    {
        BufferedImage bim = pdfRenderer.renderImageWithDPI(page, 300, ImageType.RGB);

        // suffix in filename will be used as the file format
        ImageIOUtil.writeImage(bim, pdfFilename + "-" + (page+1) + ".png", 300);
    }
    document.close();

The ImageIOUtil class is in a separate download / artifact (pdfbox-tools). Read the 2.0 dependencies page before doing your build: you'll need extra jar files for PDFs with JBIG2 images, for saving to TIFF images, and for reading encrypted files.
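Since the file suffix selects the output format, it can help to check up front whether the running JVM has an image writer for that suffix (TIFF, for instance, may need an extra jar on Java 8). The following is a minimal sketch that uses only the JDK's `javax.imageio.ImageIO` registry. The class `ImageWriterCheck` and its methods are hypothetical helpers, not part of PDFBox, and it is an assumption that this registry lookup matches what `ImageIOUtil` will find at write time.

```java
import javax.imageio.ImageIO;
import java.util.Locale;

// Hypothetical helper (not part of PDFBox): asks the JDK's ImageIO registry
// whether any installed writer plugin handles a given file suffix.
final class ImageWriterCheck {

    // Returns the lower-cased text after the last dot, or "" if there is no dot.
    static String suffixOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // True if at least one ImageIO writer is registered for the file's suffix.
    static boolean hasWriterFor(String fileName) {
        String suffix = suffixOf(fileName);
        return !suffix.isEmpty() && ImageIO.getImageWritersBySuffix(suffix).hasNext();
    }
}
```

Calling `ImageWriterCheck.hasWriterFor(pdfFilename + "-1.tif")` before the render loop would show whether a TIFF writer is on the classpath, so a missing jar is noticed before any page is rendered.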

Make sure to use the latest update of whichever JDK version you are on. For example, if you are using JDK 8, don't use 1.8.0_5; use 1.8.0_191 or whatever is the latest at the time you're reading. Early versions were very slow.

Tilman Hausherr answered Sep 16 '22 13:09


I tried it today with PdfBox 2.0.15.

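    import org.apache.pdfbox.pdmodel.*;
    import org.apache.pdfbox.rendering.*;
    import java.awt.image.*;
    import java.io.*;
    import javax.imageio.*;

    // Renders only the first page (index 0) at 300 DPI and saves it as a JPEG.
    // Place this method inside a class of your own.
    public static void PDFtoJPG (String in, String out) throws Exception
    {
        // try-with-resources closes the document even if rendering fails
        try (PDDocument pd = PDDocument.load (new File (in)))
        {
            PDFRenderer pr = new PDFRenderer (pd);
            BufferedImage bi = pr.renderImageWithDPI (0, 300);
            ImageIO.write (bi, "JPEG", new File (out));
        }
    }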
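One thing to watch when writing with plain `ImageIO` instead of `ImageIOUtil`: `ImageIO.write` returns `false` rather than throwing when no suitable writer is found, so a failed save can go unnoticed. Here is a minimal sketch of a wrapper that turns that case into an exception. The class and method names (`SafeImageWrite`, `writeOrFail`) are hypothetical and use only the JDK.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

// Hypothetical helper (not part of PDFBox): fails loudly when ImageIO
// has no writer that can handle the requested format for this image.
final class SafeImageWrite {

    static void writeOrFail(BufferedImage image, String formatName, File target)
            throws IOException {
        // ImageIO.write returns false if no appropriate writer is found.
        if (!ImageIO.write(image, formatName, target)) {
            throw new IOException(
                "No ImageIO writer accepted format '" + formatName + "' for " + target);
        }
    }
}
```

In the method above, the `ImageIO.write (bi, "JPEG", new File (out))` call could be swapped for `SafeImageWrite.writeOrFail (bi, "JPEG", new File (out))`.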
chris01 answered Sep 19 '22 13:09