I am trying to find an SDK that can generate a PDF from Open XML. I have used the Open XML Power Tools to convert the Open XML to HTML, and I am using iTextSharp to parse the HTML into a PDF. But the result is a terrible-looking PDF.
I have not yet tried iText's RTF parser. If I go in this direction, I will end up needing an RTF converter, turning a simple conversion into a two-step nightmare.
It almost looks like I might end up writing a custom converter based on the Power Tools Open XML to HTML converter. Any advice is appreciated. At this time I really can't go for a professional converter, as the licenses are too expensive (Aspose Word/TxText).
I thought I would put some more effort into my investigation. I went back to the conversion utility "http://msdn.microsoft.com/en-us/library/ff628051.aspx" and looked through its code. The biggest thing it missed was reading the underlying styles and generating a style attribute from them. With that added, the PDF looked much better, with the limitation that custom TrueType fonts are not handled. More investigation tomorrow. I am hoping someone has done something like this, or faced similar weird issues, and can shed some light.
    // Builds a CSS property map from the formatting children of el (w:jc, w:rPr).
    // Needs System.Collections.Generic, System.Collections.Specialized,
    // System.Globalization, System.Linq and System.Xml.Linq.
    private static StringDictionary GetStyle(XElement el)
    {
        IEnumerable<XElement> jcL = el.Elements(W.jc);
        IEnumerable<XElement> rPL = el.Elements(W.rPr);
        StringDictionary sd = new StringDictionary();
        if (HasAttribute(jcL, W.val))
        {
            // Word uses "both" for justified text; CSS calls it "justify".
            string jc = GetAttribute(jcL, W.val);
            sd.Add("text-align", jc == "both" ? "justify" : jc);
        }
        // Run properties exist
        if (rPL.Any())
        {
            XElement r = rPL.First();
            XElement color = r.Element(W.color);
            XElement fonts = r.Element(W.rFonts);
            XElement sz = r.Element(W.sz);
            if (r.Element(W.b) != null) sd.Add("font-weight", "bold");
            if (r.Element(W.i) != null) sd.Add("font-style", "italic");
            if (r.Element(W.u) != null) sd.Add("text-decoration", "underline");
            if (color != null && HasAttribute(color, W.val)) sd.Add("color", "#" + GetAttribute(color, W.val));
            if (fonts != null)
            {
                // Use the complex-script font if present, otherwise the high-ANSI font.
                if (HasAttribute(fonts, W.cs)) sd.Add("font-family", GetAttribute(fonts, W.cs));
                else if (HasAttribute(fonts, W.hAnsi)) sd.Add("font-family", GetAttribute(fonts, W.hAnsi));
            }
            if (sz != null && HasAttribute(sz, W.val))
            {
                // w:sz is measured in half-points, so halve it to get CSS points.
                double halfPoints = double.Parse(GetAttribute(sz, W.val), CultureInfo.InvariantCulture);
                sd.Add("font-size", (halfPoints / 2).ToString(CultureInfo.InvariantCulture) + "pt");
            }
        }
        return sd.Keys.Count > 0 ? sd : null;
    }
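To turn the dictionary that GetStyle returns into an actual style attribute, something along these lines might work. This is only a sketch: `ToStyleAttribute` and `WithStyle` are hypothetical helper names, not part of the Power Tools converter, and the code assumes the HTML is being built with LINQ to XML. Keys are sorted because StringDictionary does not guarantee any enumeration order.

```csharp
using System;
using System.Collections.Specialized;
using System.Linq;
using System.Xml.Linq;

static class StyleAttributeSketch
{
    // Hypothetical helper: joins CSS properties into "name: value; name: value".
    // Returns null when there is nothing to write, mirroring GetStyle.
    public static string ToStyleAttribute(StringDictionary sd)
    {
        if (sd == null || sd.Count == 0) return null;
        var declarations = sd.Keys.Cast<string>()
            .OrderBy(k => k, StringComparer.Ordinal) // stable output order
            .Select(k => k + ": " + sd[k]);
        return string.Join("; ", declarations);
    }

    // Hypothetical helper: sets the style attribute only when styles exist.
    public static XElement WithStyle(XElement htmlElement, StringDictionary sd)
    {
        string style = ToStyleAttribute(sd);
        if (style != null) htmlElement.SetAttributeValue("style", style);
        return htmlElement;
    }

    static void Main()
    {
        var sd = new StringDictionary();
        sd.Add("font-weight", "bold");
        sd.Add("color", "#FF0000");
        XElement span = WithStyle(new XElement("span", "Hello"), sd);
        // Prints: <span style="color: #FF0000; font-weight: bold">Hello</span>
        Console.WriteLine(span);
    }
}
```

Passing null (which GetStyle returns when it finds no formatting) leaves the element untouched, so the helper can be called unconditionally.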
I don't know of any direct converter with source code available, but yes, my thought is that you may need to build a converter from scratch. Luckily (I guess), Word's WordprocessingML is the simplest of the Open XML formats, and you can look to other projects for inspiration.
For commercial and server-side solutions, you can use either Word Automation Services (requires SharePoint) or Aspose.Words for .NET.
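To give a feel for what building a converter from scratch involves, here is a deliberately tiny sketch that walks paragraphs and runs with LINQ to XML. It is an illustration only: the class and method names are made up, it handles just bold and italic, and it ignores styles, tables, images, and everything else a real converter would need. The only fixed piece is the WordprocessingML main namespace URI.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class MiniWordToHtml
{
    // The WordprocessingML main namespace (the "w:" prefix in document.xml).
    static readonly XNamespace WNs =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Maps each w:p to <p> and each w:r inside it to <span>.
    public static XElement ConvertBody(XElement body)
    {
        return new XElement("div",
            body.Elements(WNs + "p").Select(p =>
                new XElement("p",
                    p.Elements(WNs + "r").Select(ConvertRun))));
    }

    // Reads only w:b and w:i from the run properties; everything else is ignored.
    static XElement ConvertRun(XElement run)
    {
        XElement rPr = run.Element(WNs + "rPr");
        string style = "";
        if (rPr != null && rPr.Element(WNs + "b") != null) style += "font-weight: bold; ";
        if (rPr != null && rPr.Element(WNs + "i") != null) style += "font-style: italic; ";
        // A run may contain several w:t elements; concatenate their text.
        string text = string.Concat(run.Elements(WNs + "t").Select(t => t.Value));
        var span = new XElement("span", text);
        if (style.Length > 0) span.SetAttributeValue("style", style.Trim());
        return span;
    }

    static void Main()
    {
        XElement body = XElement.Parse(
            "<w:body xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'>" +
            "<w:p><w:r><w:rPr><w:b/></w:rPr><w:t>Hello</w:t></w:r>" +
            "<w:r><w:t xml:space='preserve'> world</w:t></w:r></w:p></w:body>");
        Console.WriteLine(ConvertBody(body).ToString(SaveOptions.DisableFormatting));
    }
}
```

In a real document the body element would come from the main document part of the package rather than from an inline string, and formatting inherited from named styles would have to be resolved as well, which is where most of the work lies.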