
Append multiple DOCX files together

Tags: c#, docx, openxml

I need to programmatically append several preexisting docx files into a single, long docx file using C#, including special markup like bullets and images. Header and footer information will be stripped out, so those won't be around to cause any problems.

I can find plenty of information about manipulating an individual docx file with .NET Framework 3, but nothing easy or obvious about how you would merge files. There is also a third-party program (Acronis.Words) that will do it, but it is prohibitively expensive.

Update:

Automating through Word has been suggested, but my code is going to be running on ASP.NET on an IIS web server, so going out to Word is not an option for me. Sorry for not mentioning that in the first place.

ShootTheCore asked Oct 29 '08 17:10


2 Answers

Despite all the good suggestions and solutions submitted, I developed an alternative. In my opinion, you should avoid using Word in server applications entirely. So I worked with OpenXML, but I could not get it to work with AltChunk. Instead, I append the content of each document to the body of the first one. The method takes a list of byte[] rather than a list of file names, but you can easily change the code to suit your needs.

using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

namespace OfficeMergeControl
{
    public class CombineDocs
    {
        // Appends the body content of documents[1..] to documents[0] and returns the merged file.
        // Only the body XML is copied; parts referenced by relationship id (such as images) are not.
        public byte[] OpenAndCombine(IList<byte[]> documents)
        {
            XNamespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

            // Copy the first document into an expandable stream so it can be edited in place.
            MemoryStream mainStream = new MemoryStream();
            mainStream.Write(documents[0], 0, documents[0].Length);
            mainStream.Position = 0;

            int pointer = 1;
            byte[] ret;
            try
            {
                using (WordprocessingDocument mainDocument = WordprocessingDocument.Open(mainStream, true))
                {
                    XElement newBody = XElement.Parse(mainDocument.MainDocumentPart.Document.Body.OuterXml);

                    // The section properties must stay the last child of the body, so set them aside.
                    XElement sectPr = newBody.Element(w + "sectPr");
                    if (sectPr != null) sectPr.Remove();

                    for (pointer = 1; pointer < documents.Count; pointer++)
                    {
                        // The source documents are only read, so open them read-only and dispose them.
                        using (WordprocessingDocument tempDocument = WordprocessingDocument.Open(new MemoryStream(documents[pointer]), false))
                        {
                            XElement tempBody = XElement.Parse(tempDocument.MainDocumentPart.Document.Body.OuterXml);

                            // Add the children of the body, not a nested <w:body>, and skip its section properties.
                            newBody.Add(tempBody.Elements().Where(e => e.Name != w + "sectPr"));
                        }
                    }

                    if (sectPr != null) newBody.Add(sectPr);
                    mainDocument.MainDocumentPart.Document.Body = new Body(newBody.ToString());
                    mainDocument.MainDocumentPart.Document.Save();
                }

                // Disposing the document above writes the package back to mainStream.
                ret = mainStream.ToArray();
            }
            catch (Exception e)
            {
                // OfficeMergeControlException is a custom exception type that is not shown in this answer.
                throw new OfficeMergeControlException(string.Format(CultureInfo.CurrentCulture, "Error while merging files. Document index {0}", pointer), e);
            }
            finally
            {
                mainStream.Dispose();
            }
            return ret;
        }
    }
}
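The code above refers to an OfficeMergeControlException type that is not shown, and it expects the documents as byte arrays rather than file names. A minimal sketch of both missing pieces follows, assuming the exception is a plain wrapper around the inner exception; the DocumentLoader class and its ReadAll method are hypothetical names introduced here for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

namespace OfficeMergeControl
{
    // Hypothetical definition: the answer uses this type but does not show it.
    public class OfficeMergeControlException : Exception
    {
        public OfficeMergeControlException(string message, Exception inner)
            : base(message, inner)
        {
        }
    }

    // Hypothetical helper for callers that start from file names.
    public static class DocumentLoader
    {
        // Reads each file fully into memory, preserving the order of the paths.
        public static IList<byte[]> ReadAll(IEnumerable<string> paths)
        {
            return paths.Select(p => File.ReadAllBytes(p)).ToList();
        }
    }
}
```

With these in place, a call might look like `byte[] merged = new CombineDocs().OpenAndCombine(DocumentLoader.ReadAll(paths));` followed by `File.WriteAllBytes(outputPath, merged);`, where `paths` and `outputPath` are supplied by the caller.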

I hope this helps you.

GRGodoi answered Oct 20 '22 15:10


You don't need to use automation. DOCX files are based on the OpenXML Formats. They are just zip files with a bunch of XML and binary parts (think files) inside. You can open them with the Packaging API (System.IO.Packaging in WindowsBase.dll) and manipulate them with any of the XML classes in the Framework.
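As a rough illustration of that approach, the sketch below opens a DOCX stream with System.IO.Packaging, follows the package-level relationship to the main document part, and reads the paragraph text with LINQ to XML. The class and method names (DocxInspector, ReadParagraphs) are made up for this example, and it assumes a Transitional Open XML file, which is where this relationship type applies.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Packaging;
using System.Linq;
using System.Xml.Linq;

public static class DocxInspector
{
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Relationship type that points from the package root to the main document part.
    const string OfficeDocumentRelType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

    // Returns the text of every paragraph in the main document part.
    public static List<string> ReadParagraphs(Stream docx)
    {
        using (Package package = Package.Open(docx, FileMode.Open, FileAccess.Read))
        {
            // Locate the main document part instead of hard-coding /word/document.xml.
            PackageRelationship rel =
                package.GetRelationshipsByType(OfficeDocumentRelType).First();
            Uri partUri = PackUriHelper.ResolvePartUri(
                new Uri("/", UriKind.Relative), rel.TargetUri);
            PackagePart mainPart = package.GetPart(partUri);

            using (Stream partStream = mainPart.GetStream(FileMode.Open, FileAccess.Read))
            {
                XDocument xml = XDocument.Load(partStream);

                // Each w:p is a paragraph; its text lives in the w:t descendants.
                return xml.Descendants(W + "p")
                          .Select(p => string.Concat(
                              p.Descendants(W + "t").Select(t => t.Value)))
                          .ToList();
            }
        }
    }
}
```

The same Package and PackagePart objects can be opened with write access to modify the XML, but parts that the body references through relationships (images, numbering definitions) have to be copied and re-linked separately.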

Check out OpenXMLDeveloper.org for details.

Rob Windsor answered Oct 20 '22 16:10