 

Import doc and docx files in .Net and C#

I'm writing a text editor and I want to add the ability to import .doc and .docx files. I know that I could use OLE Automation, but if I use a recent OLE library, it won't work for people with an older version of Word, and if I use an older version instead, it won't be able to read .docx files. Any ideas? Thanks

EDIT: Since my application already works with HTML and RTF, another solution would be to convert the .doc and .docx files to one of these formats from the command line, something like this: http://www.snee.com/bobdc.blog/2007/09/using-word-for-command-line-co.html
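A rough sketch of how that command-line idea might look from C#, not a tested recipe. It assumes Word is installed and that a conversion macro already exists in Word's Normal template; the macro name `ExportAsHtml` and the class and method names are hypothetical. The only Word feature relied on is the `/m<MacroName>` startup switch, which runs the named macro when Word starts.

```csharp
using System.Diagnostics;

public static class WordCommandLine
{
    // Builds the argument string: the quoted document path followed by
    // Word's /m switch naming the macro to run at startup.
    public static string BuildArguments(string documentPath, string macroName)
    {
        return "\"" + documentPath + "\" /m" + macroName;
    }

    // Starts Word with the document and the macro, then waits for Word to exit.
    // Assumption: the macro saves the document in the target format and then
    // quits Word itself; otherwise WaitForExit blocks until the user closes Word.
    public static void RunConversionMacro(string documentPath, string macroName)
    {
        var startInfo = new ProcessStartInfo("winword.exe", BuildArguments(documentPath, macroName))
        {
            // Let the shell resolve winword.exe from the registered application paths.
            UseShellExecute = true
        };

        using (Process process = Process.Start(startInfo))
        {
            // Process.Start can return null when no new process was started.
            if (process != null)
            {
                process.WaitForExit();
            }
        }
    }
}
```

A call would then look like `WordCommandLine.RunConversionMacro(@"C:\docs\letter.doc", "ExportAsHtml");`, after which the application reads the file the macro wrote.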

Javier Marín asked Jan 01 '26 08:01


2 Answers

It works with the Office 2003 PIA; I tested it on my computer running Office 2010:

using System;
using System.IO;
using System.Text;
using Microsoft.Office.Interop.Word;

public string GetHtmlFromDoc(string path)
{
    var wordApp = new Application { Visible = false };
    string destPath = Path.Combine(Path.GetTempPath(), "word" + new Random().Next() + ".html");

    try
    {
        // Load the document
        object srcPath = path;
        var wordDoc = wordApp.Documents.Open(ref srcPath);

        if (wordDoc != null)
        {
            // Save it as HTML
            object oDestPath = destPath;
            object exportFormat = WdSaveFormat.wdFormatHTML;
            wordDoc.SaveAs(ref oDestPath, ref exportFormat);

            // Close the document
            wordDoc.Close();
        }
    }
    finally
    {
        // Quit Word even if opening or saving failed
        wordApp.Quit();
    }

    // Check that the exported file exists
    if (File.Exists(destPath))
    {
        return File.ReadAllText(destPath, Encoding.Default);
    }
    return null;
}
Javier Marín answered Jan 03 '26 21:01


Why don't you use the Office Primary Interop Assemblies (PIAs)?

I think you will have to decide which versions of Word you want to support. I suggest you settle on Word 2003 as the lowest. That will allow you to use the Office 2003 PIAs and program against them. Installing the PIAs on a machine installs binding redirects as well, so they work with newer versions of Word. There should be no problem opening .docx files with Word 2007 or 2010 through the Office 2003 PIAs, although I haven't tried this myself.
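If you do settle on a minimum version, it may help to check what is installed before trying to import. The following is only a sketch, and it uses late binding through the `Word.Application` ProgID rather than the PIAs, so it compiles without an Office reference. The class and method names are hypothetical; the version numbers rely on Word 2003 reporting itself as `11.0`, Word 2007 as `12.0` and Word 2010 as `14.0`.

```csharp
using System;

public static class WordProbe
{
    // Returns the version string reported by the installed Word
    // (for example "14.0" for Word 2010), or null if Word is not registered.
    // Windows only: this starts Word briefly through COM and quits it again.
    public static string GetInstalledWordVersion()
    {
        Type wordType = Type.GetTypeFromProgID("Word.Application");
        if (wordType == null)
        {
            return null;
        }

        dynamic app = Activator.CreateInstance(wordType);
        try
        {
            return (string)app.Version;
        }
        finally
        {
            app.Quit();
        }
    }

    // True when the major version is 11 (Word 2003) or newer.
    public static bool IsSupportedVersion(string version)
    {
        if (string.IsNullOrEmpty(version))
        {
            return false;
        }

        int major;
        return int.TryParse(version.Split('.')[0], out major) && major >= 11;
    }
}
```

The import command could then be enabled only when `WordProbe.IsSupportedVersion(WordProbe.GetInstalledWordVersion())` returns true, and the application could show a message otherwise.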

CesarGon answered Jan 03 '26 21:01


