
How can I embed any file type into Microsoft Word using OpenXml 2.0

I spent a lot of time trying to figure out a good way to embed any file into Microsoft Word using OpenXml 2.0. Office documents are fairly easy, but what about other file types such as PDF, TXT, GIF, JPG, HTML, etc.?

What is a good way to get this to work for any file type, in C#?

D Lyonnais asked Jul 23 '10 20:07


2 Answers

Embedding Foreign Objects (PDF, TXT, GIF, etc…) into Microsoft Word using OpenXml 2.0 (Well, in collaboration with COM)

I got a lot from this site, so I asked and answered my own question here in order to give back a little on a topic where I had difficulty finding answers. I hope it helps people.

There are several examples out there that show how to embed an Office Document into another Office Document using OpenXml 2.0. What's not out there, in an easily understandable form, is how to embed just about any file into an Office Document.

I have learned a lot from other people's code, so this is my attempt to contribute. Since I am already using OpenXml to generate documents, and I need to embed other files into Word, I have decided to use a combination of OpenXml and COM (Microsoft Office 2007 dll's) to achieve my goal. If you are like me, "invoking the OLE server application to create an IStorage" doesn't mean much to you.

In this example I'd like to show how I use COM to programmatically get the OLE binary data of the attached file, and then how I use that data within my OpenXml document. Basically, I am programmatically reading what the OpenXml 2.0 Document Reflector would show, in order to get the information I need.

My code below is broken down into several classes, but here is an outline of what I am doing:

  1. Create an OpenXml WordprocessingDocument and get the System.IO.FileInfo for the file you want to embed
  2. Create a custom OpenXmlEmbeddedObject object (this is what holds all the binary data)
  3. Use the binary data from the above step to create Data and Image Streams
  4. Use those Streams as the File Object and File Image for your OpenXml Document

I know there is a lot of code, and not much explanation… Hopefully it is easy to follow and will help people out 

Requirements:
• DocumentFormat.OpenXml dll (OpenXml 2.0)
• WindowsBase dll
• Microsoft.Office.Interop.Word dll (Office 2007 – version 12)

• This is the main class that starts everything: it opens a WordprocessingDocument and calls the helper class that attaches the file

using DocumentFormat.OpenXml.Packaging;
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public class MyReport
{
    private MainDocumentPart _mainDocumentPart;

    public void CreateReport()
    {
        using (WordprocessingDocument wpDocument = WordprocessingDocument.Create(@"TempPath\MyReport.docx", WordprocessingDocumentType.Document))
        {
            _mainDocumentPart = wpDocument.AddMainDocumentPart();
            _mainDocumentPart.Document = new Document(new Body());

            AttachFile(@"MyFilePath\MyFile.pdf", true);
        }
    }

    private void AttachFile(string filePathAndName, bool displayAsIcon)
    {
        FileInfo fileInfo = new FileInfo(filePathAndName);

        OpenXmlHelper.AppendEmbeddedObject(_mainDocumentPart, fileInfo, displayAsIcon);
    }
}

• This class is an OpenXml helper class; it holds all the logic to embed an object into your OpenXml file

using System;       // String, Convert
using System.IO;    // FileInfo, Stream, MemoryStream
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Validation;
using DocumentFormat.OpenXml.Wordprocessing;
using OVML = DocumentFormat.OpenXml.Vml.Office;
using V = DocumentFormat.OpenXml.Vml;

public class OpenXmlHelper
{
    /// <summary>
    /// Appends an Embedded Object into the specified Main Document
    /// </summary>
    /// <param name="mainDocumentPart">The MainDocument Part of your OpenXml Word Doc</param>
    /// <param name="fileInfo">The FileInfo object associated with the file being embedded</param>
    /// <param name="displayAsIcon">Whether or not to display the embedded file as an Icon (Otherwise it will display a snapshot of the file)</param>
    public static void AppendEmbeddedObject(MainDocumentPart mainDocumentPart, FileInfo fileInfo, bool displayAsIcon)
    {
        OpenXmlEmbeddedObject openXmlEmbeddedObject = new OpenXmlEmbeddedObject(fileInfo, displayAsIcon);

        if (!String.IsNullOrEmpty(openXmlEmbeddedObject.OleObjectBinaryData))
        {
            using (Stream dataStream = new MemoryStream(Convert.FromBase64String(openXmlEmbeddedObject.OleObjectBinaryData)))
            {
                if (!String.IsNullOrEmpty(openXmlEmbeddedObject.OleImageBinaryData))
                {
                    using (Stream emfStream = new MemoryStream(Convert.FromBase64String(openXmlEmbeddedObject.OleImageBinaryData)))
                    {
                        string imagePartId = GetUniqueXmlItemID();
                        ImagePart imagePart = mainDocumentPart.AddImagePart(ImagePartType.Emf, imagePartId);

                        if (emfStream != null)
                        {
                            imagePart.FeedData(emfStream);
                        }

                        string embeddedPackagePartId = GetUniqueXmlItemID();

                        if (dataStream != null)
                        {
                            if (openXmlEmbeddedObject.ObjectIsOfficeDocument)
                            {
                                EmbeddedPackagePart embeddedObjectPart = mainDocumentPart.AddNewPart<EmbeddedPackagePart>(
                                    openXmlEmbeddedObject.FileContentType, embeddedPackagePartId);
                                embeddedObjectPart.FeedData(dataStream);
                            }
                            else
                            {
                                EmbeddedObjectPart embeddedObjectPart = mainDocumentPart.AddNewPart<EmbeddedObjectPart>(
                                    openXmlEmbeddedObject.FileContentType, embeddedPackagePartId);
                                embeddedObjectPart.FeedData(dataStream);
                            }
                        }

                        if (!displayAsIcon && !openXmlEmbeddedObject.ObjectIsPicture)
                        {
                            Paragraph attachmentHeader = CreateParagraph(String.Format("Attachment: {0} (Double-Click to Open)", fileInfo.Name));
                            mainDocumentPart.Document.Body.Append(attachmentHeader);
                        }

                        Paragraph embeddedObjectParagraph = GetEmbeededObjectParagraph(openXmlEmbeddedObject.FileType,
                            imagePartId, openXmlEmbeddedObject.OleImageStyle, embeddedPackagePartId);

                        mainDocumentPart.Document.Body.Append(embeddedObjectParagraph);
                    }
                }
            }
        }
    }

    /// <summary>
    /// Gets Paragraph that includes the embedded object
    /// </summary>
    private static Paragraph GetEmbeededObjectParagraph(string fileType, string imageID, string imageStyle, string embeddedPackageID)
    {
        EmbeddedObject embeddedObject = new EmbeddedObject();

        string shapeID = GetUniqueXmlItemID();
        V.Shape shape = new V.Shape() { Id = shapeID, Style = imageStyle };
        V.ImageData imageData = new V.ImageData() { Title = "", RelationshipId = imageID };

        shape.Append(imageData);
        OVML.OleObject oleObject = new OVML.OleObject()
        {
            Type = OVML.OleValues.Embed,
            ProgId = fileType,
            ShapeId = shapeID,
            DrawAspect = OVML.OleDrawAspectValues.Icon,
            ObjectId = GetUniqueXmlItemID(),
            Id = embeddedPackageID
        };

        embeddedObject.Append(shape);
        embeddedObject.Append(oleObject);

        Paragraph paragraphImage = new Paragraph();

        Run runImage = new Run(embeddedObject);
        paragraphImage.Append(runImage);

        return paragraphImage;
    }

    /// <summary>
    /// Gets a Unique ID for an XML Item, for reference purposes
    /// </summary>
    /// <returns>A GUID string with the dashes removed, prefixed with "r"</returns>
    public static string GetUniqueXmlItemID()
    {
        return "r" + System.Guid.NewGuid().ToString().Replace("-", "");
    }

    private static Paragraph CreateParagraph(string paragraphText)
    {
        Paragraph paragraph = new Paragraph();
        ParagraphProperties paragraphProperties = new ParagraphProperties();

        paragraphProperties.Append(new Justification()
        {
            Val = JustificationValues.Left
        });

        paragraphProperties.Append(new SpacingBetweenLines()
        {
            After = Convert.ToString(100),
            Line = Convert.ToString(100),
            LineRule = LineSpacingRuleValues.AtLeast
        });

        Run run = new Run();
        RunProperties runProperties = new RunProperties();

        Text text = new Text();

        if (!String.IsNullOrEmpty(paragraphText))
        {
            text.Text = paragraphText;
        }

        run.Append(runProperties);
        run.Append(text);

        paragraph.Append(paragraphProperties);
        paragraph.Append(run);

        return paragraph;
    }

}

• This is the most important part of the process: it uses Microsoft's internal OLE server to create the binary DATA and binary EMF information for a file. All you have to do here is call the OpenXmlEmbeddedObject constructor and everything gets taken care of. It mimics what happens when you manually drag a file into Word; some kind of conversion goes on that turns the file into an OLE object so that Word can recognize it.
  o The most important parts of this class are the OleObjectBinaryData and OleImageBinaryData properties; they contain the Base64-encoded binary data for the file and for the '.emf' image.
  o If you choose not to display the file as an icon, the '.emf' image data will be a snapshot of the file (the first page of a PDF, for example), which you can still double-click to open.
  o If you are embedding an image and choose not to display it as an icon, the OleObjectBinaryData and OleImageBinaryData properties will be the same.

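using System;                   // String, Convert, DateTime, ArgumentNullException
using System.Runtime.InteropServices;
using System.Xml;
using System.Diagnostics;
using System.IO;
using System.Drawing;
using Microsoft.Win32;          // Registry, RegistryKey (used in GetFileType)
using Microsoft.Office.Interop.Word;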

public class OpenXmlEmbeddedObject
{
    #region Constants

    private const string _defaultOleContentType = "application/vnd.openxmlformats-officedocument.oleObject";
    private const string _oleObjectDataTag = "application/vnd";
    private const string _oleImageDataTag = "image/x-emf";

    #endregion Constants

    #region Member Variables

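    // Instance fields (not static), so that each embedded file keeps its own state
    // when several OpenXmlEmbeddedObject instances are created in the same process
    private FileInfo _fileInfo;
    private string _filePathAndName;
    private bool _displayAsIcon;
    private bool _objectIsPicture;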

    private object _objectMissing = System.Reflection.Missing.Value;
    private object _objectFalse = false;
    private object _objectTrue = true;

    #endregion Member Variables

    #region Properties

    /// <summary>
    /// The File Type, as stored in Registry (Ex: a GIF Image = 'giffile')
    /// </summary>
    public string FileType
    {
        get
        {
            if (String.IsNullOrEmpty(_fileType) && _fileInfo != null)
            {
                _fileType = GetFileType(_fileInfo, false);
            }

            return _fileType;
        }
    }
    private string _fileType;

    /// <summary>
    /// The File Content Type, as stored in Registry (Ex: a GIF Image = 'image/gif')
    /// * Is converted into the 'Default Office Content Type' for non-office files
    /// </summary>
    public string FileContentType
    {
        get
        {
            if (String.IsNullOrEmpty(_fileContentType) && _fileInfo != null)
            {
                _fileContentType = GetFileContentType(_fileInfo);

                if (!_fileContentType.Contains("officedocument"))
                {
                    _fileContentType = _defaultOleContentType;
                }
            }

            return _fileContentType;
        }
    }
    private string _fileContentType;

    /// <summary>
    /// Gets the ContentType Text for the file
    /// </summary>
    public static string GetFileContentType(FileInfo fileInfo)
    {
        if (fileInfo == null)
        {
            throw new ArgumentNullException("fileInfo");
        }

        string mime = "application/octet-stream";

        string ext = System.IO.Path.GetExtension(fileInfo.Name).ToLower(); 

        Microsoft.Win32.RegistryKey rk = Microsoft.Win32.Registry.ClassesRoot.OpenSubKey(ext);

        if (rk != null && rk.GetValue("Content Type") != null)
        {
            mime = rk.GetValue("Content Type").ToString();
        }

        return mime;
    }

    public bool ObjectIsOfficeDocument
    {
        get { return FileContentType != _defaultOleContentType; }
    }

    public bool ObjectIsPicture
    {
        get { return _objectIsPicture; }
    }

    public string OleObjectBinaryData
    {
        get { return _oleObjectBinaryData; }
        set { _oleObjectBinaryData = value; }
    }
    private string _oleObjectBinaryData;

    public string OleImageBinaryData
    {
        get { return _oleImageBinaryData; }
        set { _oleImageBinaryData = value; }
    }
    private string _oleImageBinaryData;

    /// <summary>
    /// The OpenXml information for the Word Application that is created (Makeshift Code Reflector)
    /// </summary>
    public string WordOpenXml
    {
        get { return _wordOpenXml; }
        set { _wordOpenXml = value; }
    }
    private String _wordOpenXml;

    /// <summary>
    /// The XmlDocument that is created based on the OpenXml Data from WordOpenXml
    /// </summary>
    public XmlDocument OpenXmlDocument
    {
        get
        {
            if (_openXmlDocument == null && !String.IsNullOrEmpty(WordOpenXml))
            {
                _openXmlDocument = new XmlDocument();
                _openXmlDocument.LoadXml(WordOpenXml);
            }

            return _openXmlDocument;
        }
    }
    private XmlDocument _openXmlDocument;

    /// <summary>
    /// The XmlNodeList, for all Nodes containing 'binaryData'
    /// </summary>
    public XmlNodeList BinaryDataXmlNodesList
    {
        get
        {
            if (_binaryDataXmlNodesList == null && OpenXmlDocument != null)
            {
                _binaryDataXmlNodesList = OpenXmlDocument.GetElementsByTagName("pkg:binaryData");
            }

            return _binaryDataXmlNodesList;
        }
    }
    private XmlNodeList _binaryDataXmlNodesList;

    /// <summary>
    /// Icon Object for the file
    /// </summary>
    public Icon ObjectIcon
    {
        get
        {
            if (_objectIcon == null)
            {
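                // Enterprise.Windows.Win32 is not a .NET Framework type; this call needs a separate
                // helper library that is not listed in the requirements above
                _objectIcon = Enterprise.Windows.Win32.Win32.GetLargeIcon(_filePathAndName);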
            }

            return _objectIcon;
        }
    }
    private Icon _objectIcon;

    /// <summary>
    /// File Name for the Icon being created
    /// </summary>
    public string ObjectIconFile
    {
        get
        {
            if (String.IsNullOrEmpty(_objectIconFile))
            {
                _objectIconFile = String.Format("{0}.ico", _filePathAndName.Replace(".", ""));
            }

            return _objectIconFile;
        }
    }
    private string _objectIconFile;

    /// <summary>
    /// Gets the original height and width of the emf file being created
    /// </summary>
    public string OleImageStyle
    {
        get
        {
            if (String.IsNullOrEmpty(_oleImageStyle) && !String.IsNullOrEmpty(WordOpenXml))
            {
                XmlNodeList xmlNodeList = OpenXmlDocument.GetElementsByTagName("v:shape");
                if (xmlNodeList != null && xmlNodeList.Count > 0)
                {
                    foreach (XmlAttribute attribute in xmlNodeList[0].Attributes)
                    {
                        if (attribute.Name == "style")
                        {
                            _oleImageStyle = attribute.Value;
                        }
                    }
                }
            }

            return _oleImageStyle;
        }

        set { _oleImageStyle = value; }
    }
    private string _oleImageStyle;

    #endregion Properties

    #region Constructor

    /// <summary>
    /// Generates binary information for the file being passed in
    /// </summary>
    /// <param name="fileInfo">The FileInfo object for the file to be embedded</param>
    /// <param name="displayAsIcon">Whether or not to display the file as an Icon (Otherwise it will show a snapshot view of the file)</param>
    public OpenXmlEmbeddedObject(FileInfo fileInfo, bool displayAsIcon)
    {
        _fileInfo = fileInfo;
        _filePathAndName = fileInfo.ToString();
        _displayAsIcon = displayAsIcon;

        SetupOleFileInformation();
    }

    #endregion Constructor

    #region Methods

    /// <summary>
    /// Creates a temporary Word App in order to add an OLE Object, then gets the OpenXML data from the file (similar to the Code Reflector info)
    /// </summary>
    private void SetupOleFileInformation()
    {
        Microsoft.Office.Interop.Word.Application wordApplication = new Microsoft.Office.Interop.Word.Application();

        Microsoft.Office.Interop.Word.Document wordDocument = wordApplication.Documents.Add(ref _objectMissing, ref _objectMissing,
            ref _objectMissing, ref _objectMissing);

        object iconObjectFileName = _objectMissing;
        object objectClassType = FileType;
        object objectFilename = _fileInfo.ToString();

        Microsoft.Office.Interop.Word.InlineShape inlineShape = null;

        if (_displayAsIcon)
        {
            if (ObjectIcon != null)
            {
                using (FileStream iconStream = new FileStream(ObjectIconFile, FileMode.Create))
                {
                    ObjectIcon.Save(iconStream);
                    iconObjectFileName = ObjectIconFile;
                }
            }

            object objectIconLabel = _fileInfo.Name;

            inlineShape = wordDocument.InlineShapes.AddOLEObject(ref objectClassType,
                ref objectFilename, ref _objectFalse, ref _objectTrue, ref iconObjectFileName,
                ref _objectMissing, ref objectIconLabel, ref _objectMissing);
        }
        else
        {
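            try
            {
                // Image.FromFile throws for non-image files, which routes them to the OLE branch below.
                // Disposing the image releases the lock it holds on the source file.
                using (Image image = Image.FromFile(_fileInfo.ToString()))
                {
                    _objectIsPicture = true;
                    OleImageStyle = String.Format("height:{0}pt;width:{1}pt", image.Height, image.Width);
                }

                wordDocument.InlineShapes.AddPicture(_fileInfo.ToString(), ref _objectMissing, ref _objectTrue, ref _objectMissing);
            }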
            catch
            {
                inlineShape = wordDocument.InlineShapes.AddOLEObject(ref objectClassType,
                    ref objectFilename, ref _objectFalse, ref _objectFalse, ref _objectMissing, ref _objectMissing,
                    ref _objectMissing, ref _objectMissing);
            }
        }

        WordOpenXml = wordDocument.Range(ref _objectMissing, ref _objectMissing).WordOpenXML;

        if (_objectIsPicture)
        {
            OleObjectBinaryData = GetPictureBinaryData();
            OleImageBinaryData = GetPictureBinaryData();
        }
        else
        {
            OleObjectBinaryData = GetOleBinaryData(_oleObjectDataTag);
            OleImageBinaryData = GetOleBinaryData(_oleImageDataTag);
        }

        // Not sure why, but Excel seems to hang in the processes if you attach an Excel file…
        // This kills the excel process that has been started < 15 seconds ago (so not to kill the user's other Excel processes that may be open)
        if (FileType.StartsWith("Excel"))
        {
            Process[] processes = Process.GetProcessesByName("EXCEL");
            foreach (Process process in processes)
            {
                // TotalSeconds is the full elapsed time; Seconds is only the 0-59 component of it
                if (DateTime.Now.Subtract(process.StartTime).TotalSeconds <= 15)
                {
                    process.Kill();
                    break;
                }
            }
        }

        wordDocument.Close(ref _objectFalse, ref _objectMissing, ref _objectMissing);
        wordApplication.Quit(ref _objectMissing, ref _objectMissing, ref _objectMissing);
    }

    /// <summary>
    /// Gets the binary data from the Xml File that is associated with the Tag passed in
    /// </summary>
    /// <param name="binaryDataXmlTag">the Tag to look for in the OpenXml</param>
    /// <returns></returns>
    private string GetOleBinaryData(string binaryDataXmlTag)
    {
        string binaryData = null;
        if (BinaryDataXmlNodesList != null)
        {
            foreach (XmlNode xmlNode in BinaryDataXmlNodesList)
            {
                if (xmlNode.ParentNode != null)
                {
                    foreach (XmlAttribute attr in xmlNode.ParentNode.Attributes)
                    {
                        if (String.IsNullOrEmpty(binaryData) && attr.Value.Contains(binaryDataXmlTag))
                        {
                            binaryData = xmlNode.InnerText;
                            break;
                        }
                    }
                }
            }
        }

        return binaryData;
    }

    /// <summary>
    /// Gets the image Binary data, if the file is an image
    /// </summary>
    /// <returns></returns>
    private string GetPictureBinaryData()
    {
        string binaryData = null;
        if (BinaryDataXmlNodesList != null)
        {
            foreach (XmlNode xmlNode in BinaryDataXmlNodesList)
            {
                binaryData = xmlNode.InnerText;
                break;
            }
        }

        return binaryData;
    }

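    /// <summary>
    /// Gets the file type for the file, as stored in the Registry: either the ProgID (Ex: 'giffile')
    /// or its description ("Application", "Text Document", etc.).
    /// </summary>
    /// <param name="fileInfo">FileInfo containing the extension</param>
    /// <param name="returnDescription">True to return the type description; false to return the ProgID</param>
    /// <returns>The type description or the ProgID; a generic "... File" string when no Registry entry is found</returns>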
    public static string GetFileType(FileInfo fileInfo, bool returnDescription)
    {
        if (fileInfo == null)
        {
            throw new ArgumentNullException("fileInfo");
        }

        string description = "File";
        if (string.IsNullOrEmpty(fileInfo.Extension))
        {
            return description;
        }
        description = string.Format("{0} File", fileInfo.Extension.Substring(1).ToUpper());
        RegistryKey typeKey = Registry.ClassesRoot.OpenSubKey(fileInfo.Extension);
        if (typeKey == null)
        {
            return description;
        }
        string type = Convert.ToString(typeKey.GetValue(string.Empty));
        RegistryKey key = Registry.ClassesRoot.OpenSubKey(type);
        if (key == null)
        {
            return description;
        }

        if (returnDescription)
        {
            description = Convert.ToString(key.GetValue(string.Empty));
            return description;
        }
        else
        {
            return type;
        }
    }

    #endregion Methods
}
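
As a rough illustration of what GetOleBinaryData does, here is a minimal, self-contained sketch that needs neither Word nor the OpenXml SDK. It assumes the WordOpenXML string is a flat package in which each pkg:part carries a pkg:contentType attribute and a pkg:binaryData child holding Base64 text. The FlatOpcDemo class, the FindBinaryData method and the sample XML are hypothetical and hand-written for illustration; they are not actual Word output.

```csharp
using System;
using System.Text;
using System.Xml;

public static class FlatOpcDemo
{
    // Returns the Base64 text of the first pkg:binaryData element whose parent
    // pkg:part has a pkg:contentType containing the given fragment, or null.
    public static string FindBinaryData(XmlDocument flatOpc, string contentTypeFragment)
    {
        foreach (XmlNode node in flatOpc.GetElementsByTagName("pkg:binaryData"))
        {
            XmlElement part = node.ParentNode as XmlElement;
            if (part == null)
            {
                continue;
            }

            // GetAttribute returns an empty string when the attribute is missing
            string contentType = part.GetAttribute("pkg:contentType");
            if (contentType.Contains(contentTypeFragment))
            {
                return node.InnerText;
            }
        }

        return null;
    }

    public static void Main()
    {
        // Hand-written stand-in for the XML that Range.WordOpenXML would return
        string payload = Convert.ToBase64String(Encoding.ASCII.GetBytes("fake EMF bytes"));
        string xml =
            "<pkg:package xmlns:pkg=\"http://schemas.microsoft.com/office/2006/xmlPackage\">" +
            "<pkg:part pkg:name=\"/word/media/image1.emf\" pkg:contentType=\"image/x-emf\">" +
            "<pkg:binaryData>" + payload + "</pkg:binaryData>" +
            "</pkg:part></pkg:package>";

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        string found = FindBinaryData(doc, "image/x-emf");

        // Decoding the Base64 text gives back the raw bytes that would be fed to the ImagePart
        Console.WriteLine(Encoding.ASCII.GetString(Convert.FromBase64String(found)));
    }
}
```

Matching on the content type of the parent part is what lets the same lookup serve both the OLE data ("application/vnd...") and the EMF image ("image/x-emf").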
D Lyonnais answered Nov 08 '22 12:11


 _objectIcon = Enterprise.Windows.Win32.Win32.GetLargeIcon(_filePathAndName); 

seems to be broken, but

_objectIcon = System.Drawing.Icon.ExtractAssociatedIcon(_filePathAndName);

should also work.
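
For anyone who wants to try the replacement on its own, here is a small sketch of extracting the associated icon and saving it as an '.ico' file, which is the kind of file the answer above passes to AddOLEObject. It assumes Windows and a reference to System.Drawing. The IconDemo class, the SaveAssociatedIcon method and the choice of the temp folder for the output are my own assumptions, not part of the original code.

```csharp
using System;
using System.Drawing;
using System.IO;

public static class IconDemo
{
    // Extracts the shell icon associated with the file and writes it to an .ico file
    // in the temp folder (hypothetical location), returning the path of that file.
    public static string SaveAssociatedIcon(string filePath)
    {
        string iconPath = Path.Combine(
            Path.GetTempPath(),
            Path.GetFileNameWithoutExtension(filePath) + ".ico");

        using (Icon icon = Icon.ExtractAssociatedIcon(filePath))
        using (FileStream stream = new FileStream(iconPath, FileMode.Create))
        {
            icon.Save(stream);
        }

        return iconPath;
    }
}
```

Calling IconDemo.SaveAssociatedIcon with the path of an existing file returns the path of the saved icon. Note that ExtractAssociatedIcon throws if the file does not exist.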

Markus answered Nov 08 '22 11:11