Convert HTML with images to PDF using iText

I have searched existing questions but have not found a solution to my specific problem. I need to convert HTML files that contain images and CSS styling to PDF. I am using iText 5 and have managed to apply the styling in the generated PDF, but I am still struggling to include the images. My code is below. The image with the absolute path is included in the generated PDF; the image with the relative path is not. I know I need to implement AbstractImageProvider, but I do not know how. Any help is greatly appreciated.

Java File:

public class Converter {

    static String in = "C:/Users/APPS/Desktop/Test_Html/index.htm";
    static String out = "C:/Users/APPS/Desktop/index.pdf";
    static String css = "C:/Users/APPS/Desktop/Test_Html/style.css";

    public static void main(String[] args) {
        try {
            convertHtmlToPdf();
        } catch (DocumentException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private static void convertHtmlToPdf() throws DocumentException, IOException {
        Document document = new Document();
        PdfWriter pdfWriter = PdfWriter.getInstance(document, new FileOutputStream(out));
        document.open();
        XMLWorkerHelper.getInstance().parseXHtml(pdfWriter, document, new FileInputStream(in), new FileInputStream(css));
        document.close();
        System.out.println("PDF Created!");
    }

    /**
     * Not sure how to implement this
     * @author APPS
     *
     */
    public class myImageProvider extends AbstractImageProvider {

        @Override
        public String getImageRootPath() {
            // TODO Auto-generated method stub
            return null;
        }

    }

}

Html File:

<!DOCTYPE html>
<html lang="en">

<head>
    <title>HTML to PDF</title>
    <link href="style.css" rel="stylesheet" type="text/css" />
</head>

<body>
    <h1>HTML to PDF</h1>
    <p>
        <span class="itext">itext</span> 5.4.2
        <span class="description"> converting HTML to PDF</span>
    </p>
    <table>
        <tr>
            <th class="label">Title</th>
            <td>iText - Java HTML to PDF</td>
        </tr>
        <tr>
            <th>URL</th>
            <td>http://wwww.someurl.com</td>
        </tr>
    </table>
    <div class="center">
        <h2>Here is an image</h2>
        <div>
            <img src="images/Vader_TFU.jpg" />
        </div>
        <div>
            <img src="https://www.w3schools.com/images/picture.jpg" alt="Mountain" />
        </div>
    </div>
</body>
</html>

Css File:

h1 {
    color: #ccc;
}

table tr td {
    text-align: center;
    border: 1px solid gray;
    padding: 4px;
}

table tr th {
    background-color: #84C7FD;
    color: #fff;
    width: 100px;
}

.itext {
    color: #84C7FD;
    font-weight: bold;
}

.description {
    color: gray;
}

.center {
    text-align: center;
}
asked Oct 17 '17 13:10 by jdubicki



1 Answer

The following is based on iText 5, version 5.5.12.

Suppose you have this directory structure:

[Image: directory structure]

With this code, using the latest iText 5:

package converthtmltopdf;

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.tool.xml.XMLWorker;
import com.itextpdf.tool.xml.XMLWorkerHelper;
import com.itextpdf.tool.xml.html.Tags;
import com.itextpdf.tool.xml.net.FileRetrieve;
import com.itextpdf.tool.xml.net.FileRetrieveImpl;
import com.itextpdf.tool.xml.parser.XMLParser;
import com.itextpdf.tool.xml.pipeline.css.CSSResolver;
import com.itextpdf.tool.xml.pipeline.css.CssResolverPipeline;
import com.itextpdf.tool.xml.pipeline.end.PdfWriterPipeline;
import com.itextpdf.tool.xml.pipeline.html.AbstractImageProvider;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipelineContext;
import com.itextpdf.tool.xml.pipeline.html.LinkProvider;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

/**
 *
 * @author george.mavrommatis
 */
public class ConvertHtmlToPdf {
    public static final String HTML = "C:\\Users\\zzz\\Desktop\\itext\\index.html";
    public static final String DEST = "C:\\Users\\zzz\\Desktop\\itext\\index.pdf";
    public static final String IMG_PATH = "C:\\Users\\zzz\\Desktop\\itext\\";
    public static final String RELATIVE_PATH = "C:\\Users\\zzz\\Desktop\\itext\\";
    public static final String CSS_DIR = "C:\\Users\\zzz\\Desktop\\itext\\";

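    /**
     * Converts the HTML file at HTML to a PDF, applying the CSS found
     * under CSS_DIR and resolving relative image paths against IMG_PATH.
     * @param file path of the PDF file to create
     * @throws IOException
     * @throws DocumentException
     */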
    public void createPdf(String file) throws IOException, DocumentException {
        // step 1
        Document document = new Document();
        // step 2
        PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(file));
        // step 3
        document.open();
        // step 4

        // CSS
        CSSResolver cssResolver =
                XMLWorkerHelper.getInstance().getDefaultCssResolver(false);
        FileRetrieve retrieve = new FileRetrieveImpl(CSS_DIR);
        cssResolver.setFileRetrieve(retrieve);

        // HTML
        HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
        htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
        htmlContext.setImageProvider(new AbstractImageProvider() {
            public String getImageRootPath() {
                return IMG_PATH;
            }
        });
        htmlContext.setLinkProvider(new LinkProvider() {
            public String getLinkRoot() {
                return RELATIVE_PATH;
            }
        });

        // Pipelines
        PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
        HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
        CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);

        // XML Worker
        XMLWorker worker = new XMLWorker(css, true);
        XMLParser p = new XMLParser(worker);
        p.parse(new FileInputStream(HTML));

        // step 5
        document.close();
    }
    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) throws IOException, DocumentException {
        // Convert the HTML file and write the PDF to DEST
        new ConvertHtmlToPdf().createPdf(DEST);
    }

}

And here is the result:

[Image: resulting PDF]

This example uses code from: https://developers.itextpdf.com/examples/xml-worker-itext5/xml-worker-examples
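
To see why the image root path matters, here is a small sketch of how a relative `src` such as `images/Vader_TFU.jpg` can be resolved against a root folder, while an absolute URL is left alone. It uses only the JDK's `java.net.URI` and is an illustration of the general idea, not iText's internal logic; the class name `ImageSrcResolution` and the helper `resolveSrc` are made up for this example, and the folder is the one from the question.

```java
import java.net.URI;

// Illustration only: plain JDK URI resolution, not iText's internal logic.
class ImageSrcResolution {

    // Hypothetical helper: resolve an <img> src value against a root URI.
    static URI resolveSrc(String rootUri, String src) {
        return URI.create(rootUri).resolve(src);
    }

    public static void main(String[] args) {
        // Root folder of the HTML file, written as a file URI (note the trailing slash).
        String root = "file:///C:/Users/APPS/Desktop/Test_Html/";

        // A relative src is appended to the root folder.
        URI relative = resolveSrc(root, "images/Vader_TFU.jpg");
        System.out.println(relative.getPath());

        // An absolute src is returned unchanged, so it never needs the root.
        URI absolute = resolveSrc(root, "https://www.w3schools.com/images/picture.jpg");
        System.out.println(absolute);

        // Without the trailing slash, the last path segment is replaced instead.
        URI noSlash = resolveSrc("file:///C:/Users/APPS/Desktop/Test_Html", "images/Vader_TFU.jpg");
        System.out.println(noSlash.getPath());
    }
}
```

The relative image ends up under `Test_Html/images/`, the absolute URL is untouched, and dropping the trailing slash makes the path point at `Desktop/images/` instead. That is a good reason to keep the trailing separator on the value returned from `getImageRootPath()`, as `IMG_PATH` does in the code above.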

Hope this helps

answered Oct 01 '22 07:10 by MaVRoSCy