I am trying to generate a .pdf from html using the library ITextSharp. I am able to create the pdf with the html text converted to pdf text/paragraphs
My Problem: The pdf does not show my images(my img elements from the html). All my img html elements in my html dont get displayed in the pdf? Is it possible for ITextSharp to parse HTML & display images. I really hope so otherwise I am stuffed :(
I am linking to the correct directory where the images are(using IMG_BASURL) but they are just not showing
My code:
// mainContents variable is a string containing my HTML
var document = new Document(PageSize.A4, 50, 50, 80, 100);
var output = new MemoryStream();
var writer = PdfWriter.GetInstance(document, output);
document.open();
Hashtable providers = new Hashtable();
providers.Add("img_baseurl","C:/users/xx/VisualStudio/Projects/myproject/");
var parsedHtmlElements = HTMLWorker.ParseToList(new StringReader(mainContents), null, providers);
foreach (var htmlElement in parsedHtmlElements)
document.Add(htmlElement as IElement);
document.Close();
Every time that I've encountered this the problem was that the image was too large for the canvas. More specifically, even a naked IMG
tag internally will get wrapped in a Chunk
that will get wrapped in a Paragraph
, and I think that the image is overflowing the Paragraph but I'm not 100% sure.
The two easy fixes are to either enlarge the canvas or to specify image dimensions on the HTML IMG
tag. The third more complex route would be to use an additional provider IMG_PROVIDER
. To do this you need to implement the IImageProvider
interface. Below is a very simple version of one
public class ImageThing : IImageProvider {
//Store a reference to the main document so that we can access the page size and margins
private Document MainDoc;
//Constructor
public ImageThing(Document doc) {
this.MainDoc = doc;
}
Image IImageProvider.GetImage(string src, IDictionary<string, string> attrs, ChainedProperties chain, IDocListener doc) {
//Prepend the src tag with our path. NOTE, when using HTMLWorker.IMG_PROVIDER, HTMLWorker.IMG_BASEURL gets ignored unless you choose to implement it on your own
src = Environment.GetFolderPath(Environment.SpecialFolder.Desktop) + @"\" + src;
//Get the image. NOTE, this will attempt to download/copy the image, you'd really want to sanity check here
Image img = Image.GetInstance(src);
//Make sure we got something
if (img == null) return null;
//Determine the usable area of the canvas. NOTE, this doesn't take into account the current "cursor" position so this might create a new blank page just for the image
float usableW = this.MainDoc.PageSize.Width - (this.MainDoc.LeftMargin + this.MainDoc.RightMargin);
float usableH = this.MainDoc.PageSize.Height - (this.MainDoc.TopMargin + this.MainDoc.BottomMargin);
//If the downloaded image is bigger than either width and/or height then shrink it
if (img.Width > usableW || img.Height > usableH) {
img.ScaleToFit(usableW, usableH);
}
//return our image
return img;
}
}
To use this provider just add it to the provider collection like you did with HTMLWorker.IMG_BASEURL
:
providers.Add(HTMLWorker.IMG_PROVIDER, new ImageThing(doc));
It should be noted that if you use HTMLWorker.IMG_PROVIDER
that you are responsible for figuring out everything about the image. The code above assumes that all image paths need to be prepended with a constant string, you'll probably want to update this and check for HTTP
at the start. Also, because we're saying that we want to completely handle the image processing pipeline the provider HTMLWorker.IMG_BASEURL
is no longer needed.
The main code loop would now look something like this:
string html = @"<img src=""Untitled-1.png"" />";
string outputFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "HtmlTest.pdf");
using (FileStream fs = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (Document doc = new Document(PageSize.A4, 50, 50, 80, 100)) {
using (PdfWriter writer = PdfWriter.GetInstance(doc, fs)) {
doc.Open();
using (StringReader sr = new StringReader(html)) {
System.Collections.Generic.Dictionary<string, object> providers = new System.Collections.Generic.Dictionary<string, object>();
providers.Add(HTMLWorker.IMG_PROVIDER, new ImageThing(doc));
var parsedHtmlElements = HTMLWorker.ParseToList(sr, null, providers);
foreach (var htmlElement in parsedHtmlElements) {
doc.Add(htmlElement as IElement);
}
}
doc.Close();
}
}
}
One last thing, make sure to specify which version of iTextSharp you are targetting when posting here. The code above targets iTextSharp 5.1.2.0 but I think you might be using the 4.X series.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With