I know similar questions have already been asked, with answers suggesting Open XML and the like.
I am using Open XML, but it only works with inline styles.
Is there a solution to this, or a better way to convert HTML to DOCX than Open XML?
Thanks!
You can inline a CSS file using a tool like the one described here.
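One possible way to do the inlining from C# is the PreMailer.Net NuGet package. This is only a sketch under that assumption, and it is not necessarily the tool linked above; the CssInliner class and the InlineFile method are hypothetical names introduced for illustration.

```csharp
using System.IO;

static class CssInliner
{
    // Hypothetical helper: reads an HTML file, moves its CSS rules into
    // style="" attributes, and writes the result to a new file.
    public static void InlineFile(string inputPath, string outputPath)
    {
        string html = File.ReadAllText(inputPath);

        // Assumes the PreMailer.Net package is referenced. The type is fully
        // qualified because the PreMailer namespace and class share a name.
        var result = PreMailer.Net.PreMailer.MoveCssInline(html);

        File.WriteAllText(outputPath, result.Html);
    }
}
```

Calling CssInliner.InlineFile("Styled.html", "YourHtmlDocument.html") would then produce the file that the conversion step reads (Styled.html is a placeholder name).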
Then, to perform the conversion (adapted from Eric White's blog):
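    using System.IO;
    using System.Linq;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;

    // Open the existing target document for editing.
    using (WordprocessingDocument myDoc =
        WordprocessingDocument.Open("ConvertedDocument.docx", true))
    {
        string altChunkId = "AltChunkId1";
        MainDocumentPart mainPart = myDoc.MainDocumentPart;

        // Embed the HTML file in the package as an alternative-format part.
        var chunk = mainPart.AddAlternativeFormatImportPart(
            AlternativeFormatImportPartType.Html, altChunkId);
        using (FileStream fileStream = File.Open("YourHtmlDocument.html", FileMode.Open))
        {
            chunk.FeedData(fileStream);
        }

        // Reference the embedded HTML after the last paragraph of the body.
        // Last() throws if the body contains no paragraphs.
        AltChunk altChunk = new AltChunk() { Id = altChunkId };
        mainPart.Document.Body.InsertAfter(
            altChunk, mainPart.Document.Body.Elements<Paragraph>().Last());
        mainPart.Document.Save();
    }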
This isn't exactly converting HTML to DOCX. It appends YourHtmlDocument.html to ConvertedDocument.docx. If ConvertedDocument.docx is initially empty, this approach is effectively a conversion.
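If no ConvertedDocument.docx exists yet, a blank one can be created with the Open XML SDK first. The following is a minimal sketch, assuming the DocumentFormat.OpenXml package is referenced; the BlankDocx class and its Create method are hypothetical names, not part of the SDK. It writes a single empty paragraph into the body, because the conversion code calls Elements<Paragraph>().Last(), which throws on a body with no paragraphs.

```csharp
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class BlankDocx
{
    // Hypothetical helper: creates a .docx containing one empty paragraph.
    public static void Create(string path)
    {
        using (WordprocessingDocument doc =
            WordprocessingDocument.Create(path, WordprocessingDocumentType.Document))
        {
            MainDocumentPart mainPart = doc.AddMainDocumentPart();

            // A body with a single empty paragraph gives the AltChunk
            // insertion an anchor element to insert after.
            mainPart.Document = new Document(new Body(new Paragraph()));
            mainPart.Document.Save();
        }
    }
}
```

Calling BlankDocx.Create("ConvertedDocument.docx") before the conversion step gives it an effectively empty target document.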
Whenever you use an AltChunk to build a document, your HTML stays embedded in the document until the next time the document is opened in Word. At that point, the HTML is converted to WordprocessingML markup. This is only an issue if the document won't be opened in MS Word. If you are uploading to Google Docs, opening in OpenOffice, or using COM to convert to a PDF, Open XML alone won't be sufficient. In that case, you'll probably need to resort to a paid tool like Aspose.Words.