
Convert HTML file to PDF file using iTextSharp

I'd like to accomplish the following:

Given the path of an HTML file and the desired path of a PDF file, convert the HTML file to PDF using iTextSharp. I've seen plenty of code samples that come close to this, but none do exactly what I need. I believe my solution will need to use the iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList() function, but I'm having trouble getting it to read an actual HTML file and write an actual PDF file.

public void GeneratePDF(string htmlFileName, string outputPDFFileName)
{...}

is the function I'd really like to get working properly.

Thanks in advance

Edit: Here's an example of what I've tried:

iTextSharp.text.Document doc = new Document();
// Attach a PdfWriter so that content added to doc is written to fromHTML.pdf.
PdfWriter.GetInstance(doc, new FileStream(Path.GetFullPath("fromHTML.pdf"), FileMode.Create));

doc.Open();

try
{
    // Parse the HTML file into iTextSharp elements, then add each one to the document.
    List<IElement> list = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(new StringReader(File.ReadAllText(this.textBox1.Text)), null);
    foreach (IElement elm in list)
    {
        doc.Add(elm);
    }
}
catch (Exception ex)
{
    MessageBox.Show(ex.Message);
}

doc.Close();

Note that textBox1.Text contains the full path of the HTML file I'm trying to convert, and I want the output written to "fromHTML.pdf".

Thanks!

Ben asked Dec 08 '10 21:12





1 Answer

I had the same requirement and was led to this page by Google, but could not find a concrete answer. After some trial and error, I was able to convert HTML to PDF using the iTextSharp library, version 5.1.1. The code I have shared also handles img tags with relative paths; iTextSharp throws an error if an img tag's src is not absolute. You can find the code here: http://am22tech.com/s/22/Blogs/post/2011/09/28/HTML-To-PDF-using-iTextSharp.aspx
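
As a rough illustration of the relative-path issue (this is not the code from the linked post), one possible approach is to rewrite every img src to an absolute file path before handing the HTML to iTextSharp. The names ImgSrcRewriter and MakeImageSourcesAbsolute are hypothetical, the regex only handles quoted src attributes, and the sketch assumes the images live on disk relative to the HTML file's directory:

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of iTextSharp.
public static class ImgSrcRewriter
{
    // Group 1: everything from "<img" up to "src=", group 2: the quote
    // character, group 3: the src value. Unquoted src values are not matched.
    private static readonly Regex ImgSrc = new Regex(
        @"(<img\b[^>]*?\bsrc\s*=\s*)([""'])(.*?)\2",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static string MakeImageSourcesAbsolute(string html, string baseDirectory)
    {
        return ImgSrc.Replace(html, m =>
        {
            string src = m.Groups[3].Value;

            // Leave values that already parse as absolute URIs
            // (http:, https:, file:, C:\...) untouched.
            Uri parsed;
            if (Uri.TryCreate(src, UriKind.Absolute, out parsed))
            {
                return m.Value;
            }

            // Resolve the relative path against the HTML file's directory.
            string absolute = Path.GetFullPath(Path.Combine(baseDirectory, src));
            string quote = m.Groups[2].Value;
            return m.Groups[1].Value + quote + absolute + quote;
        });
    }
}
```

A call might look like ImgSrcRewriter.MakeImageSourcesAbsolute(File.ReadAllText(htmlFileName), Path.GetDirectoryName(Path.GetFullPath(htmlFileName))), with the result passed on to the parser. A regex is a shortcut here; a real HTML parser would be more robust for messy markup.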

Let me know if you need more information. The code is in C#.
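
For reference, here is a minimal sketch of the GeneratePDF(htmlFileName, outputPDFFileName) function from the question. It is not the code from the linked post. It assumes iTextSharp 5.x, where HTMLWorker.ParseToList returns a List<IElement>, and the class name HtmlToPdfSketch is made up for the example. HTMLWorker understands only a limited subset of HTML and CSS, so complex pages may not render as expected:

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.html.simpleparser;
using iTextSharp.text.pdf;

// Hypothetical wrapper class; assumes a reference to iTextSharp 5.x.
public static class HtmlToPdfSketch
{
    public static void GeneratePDF(string htmlFileName, string outputPDFFileName)
    {
        using (var output = new FileStream(outputPDFFileName, FileMode.Create, FileAccess.Write))
        using (var reader = new StreamReader(htmlFileName))
        {
            var doc = new Document();

            // The writer must be attached before the document is opened.
            PdfWriter.GetInstance(doc, output);
            doc.Open();
            try
            {
                // Parse the HTML into iTextSharp elements (no style sheet)
                // and add them to the document in order.
                List<IElement> elements = HTMLWorker.ParseToList(reader, null);
                foreach (IElement element in elements)
                {
                    doc.Add(element);
                }
            }
            finally
            {
                // Close() finishes the PDF; note that it throws if nothing
                // was added to the document (for example, empty HTML).
                doc.Close();
            }
        }
    }
}
```

Usage would be something like HtmlToPdfSketch.GeneratePDF(textBox1.Text, "fromHTML.pdf"). If the HTML contains img tags with relative paths, they would need to be made absolute first, as described above.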

Soan answered Sep 30 '22 09:09
