Generating PDF from OpenXml

I am trying to find an SDK that can generate PDF from Open XML. I have used the Open XML Power Tools to convert the Open XML to HTML, and then iTextSharp to parse the HTML into PDF, but the result is a terrible-looking PDF.

I have not yet tried iText's RTF parser. If I go in this direction, I will end up needing an RTF converter, which turns a simple conversion into a two-step nightmare.

It almost looks like I might end up writing a custom converter based on the Power Tools Open XML to HTML converter. Any advice is appreciated. At this time I really can't go for a professional converter, as the licenses are too expensive (Aspose.Words / TX Text Control).


I thought I would put some more effort into my investigation. I went back to the conversion utility (http://msdn.microsoft.com/en-us/library/ff628051.aspx) and looked through its code. The biggest thing it missed was reading the underlying styles and generating a style attribute. With that added, the PDF looked much better, with the limitation of not handling custom TrueType fonts. More investigation tomorrow. I am hoping someone has done something like this or faced similar weird issues and can shed some light.




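    // Needs: System.Collections.Generic, System.Collections.Specialized,
    // System.Globalization, System.Linq, System.Xml.Linq.
    // HasAttribute/GetAttribute are helpers from the converter code (not shown here).
    private static StringDictionary GetStyle(XElement el)
    {
        // Typed sequences, so that the LINQ Any()/First() extensions apply.
        IEnumerable<XElement> jcL = el.Elements(W.jc);
        IEnumerable<XElement> spacingL = el.Elements(W.spacing); // not mapped to CSS yet
        IEnumerable<XElement> rPL = el.Elements(W.rPr);

        StringDictionary sd = new StringDictionary();

        if (HasAttribute(jcL, W.val))
        {
            // w:jc uses "both" for justified text; CSS calls it "justify".
            string jc = GetAttribute(jcL, W.val);
            sd.Add("text-align", jc == "both" ? "justify" : jc);
        }

        // run properties exist
        if (rPL.Any())
        {
            XElement r = rPL.First();
            XElement color = r.Element(W.color);
            XElement fonts = r.Element(W.rFonts);
            XElement sz = r.Element(W.sz);

            if (r.Element(W.b) != null) sd.Add("font-weight", "bolder");
            if (r.Element(W.i) != null) sd.Add("font-style", "italic");
            if (r.Element(W.u) != null) sd.Add("text-decoration", "underline");

            // w:color may be "auto", which is not a hex value, so skip it.
            if (color != null && HasAttribute(color, W.val) && GetAttribute(color, W.val) != "auto")
                sd.Add("color", "#" + GetAttribute(color, W.val));

            if (fonts != null)
            {
                // Prefer the complex-script font, then fall back to hAnsi.
                if (HasAttribute(fonts, W.cs)) sd.Add("font-family", GetAttribute(fonts, W.cs));
                else if (HasAttribute(fonts, W.hAnsi)) sd.Add("font-family", GetAttribute(fonts, W.hAnsi));
            }

            if (sz != null && HasAttribute(sz, W.val))
            {
                // w:sz is expressed in half-points, so halve it to get CSS points.
                double halfPoints = double.Parse(GetAttribute(sz, W.val), CultureInfo.InvariantCulture);
                sd.Add("font-size", (halfPoints / 2).ToString(CultureInfo.InvariantCulture) + "pt");
            }
        }

        return sd.Keys.Count > 0 ? sd : null;
    }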
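
The dictionary returned by GetStyle still has to be turned into the value of an HTML style attribute before the HTML goes to the PDF step. A minimal sketch of that step follows. It assumes a hypothetical helper named ToStyleAttribute, which is not part of the Power Tools code, and it emits values as-is (font names containing spaces are not quoted here):

```csharp
using System;
using System.Collections.Specialized;
using System.Linq;

public static class StyleAttributeSketch
{
    // Hypothetical helper (an assumption, not from the Power Tools source):
    // serializes a StringDictionary of CSS properties into "name: value; ..." form.
    public static string ToStyleAttribute(StringDictionary sd)
    {
        if (sd == null || sd.Count == 0) return null;

        // StringDictionary does not guarantee enumeration order, so sort
        // the keys to keep the output stable between runs.
        var declarations = sd.Keys
            .Cast<string>()
            .OrderBy(k => k, StringComparer.Ordinal)
            .Select(k => k + ": " + sd[k]);

        return string.Join("; ", declarations);
    }

    public static void Main()
    {
        var sd = new StringDictionary();
        sd.Add("font-weight", "bold");
        sd.Add("color", "#FF0000");

        // prints "color: #FF0000; font-weight: bold"
        Console.WriteLine(ToStyleAttribute(sd));
    }
}
```

The resulting string could then be attached to the generated element, for example with `new XAttribute("style", value)`, whenever GetStyle returns a non-null dictionary.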


Shrini asked Nov 13 '22 17:11



1 Answer

I don't know of any direct converter with source code available, but yeah, my thought is that you may need to build a converter from scratch. Luckily (I guess), Word's WordprocessingML is the simplest of the Open XML formats, and you can look to other projects for inspiration, such as:

  1. TextGlow - Word to Silverlight converter
  2. Word to XAML Converter - Word to XAML converter (probably very similar to TextGlow above)
  3. OpenXML-DAISY - conversion to Daisy
  4. ODF Converter - convert from/to OpenOffice formats and OpenXML
  5. The XHTML solution by Eric White you already referenced.

For commercial and server-side solutions, you can use either Word Automation Services (requires SharePoint) or Aspose.Words for .NET.

Todd Main answered Dec 10 '22 10:12
