Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

OpenXML spreadsheet created in .NET won't open in iPad

I am trying to generate a spreadsheet in .NET which will be opened by my manager on his iPad when he's out of the office.

The spreadsheet opens fine on a Windows PC, but when trying to open on the iPad it says "An error occurred while reading the document" (so useful!)

By using the "Compare" feature on the OpenXML SDK Productivity tool with a document that does open on the iPad, and by doing some manual editing of the faulty document's XML files in notepad I have narrowed it down to the file xl/_rels/workbook.xml.rels which stores the relationships of the parts in the workbook.

This is the code I am using to generate the WorkbookPart and references

    WorkbookPart workbookPart1 = document.AddWorkbookPart();

    WorkbookStylesPart workbookStylesPart1 = workbookPart1.AddNewPart<WorkbookStylesPart>("rId3");
    ThemePart themePart1 = workbookPart1.AddNewPart<ThemePart>("rId2");
    WorksheetPart worksheetPart1 = workbookPart1.AddNewPart<WorksheetPart>("rId1");

My code generates the following output, which does not open on the iPad.

      <?xml version="1.0" encoding="utf-8" ?> 
      <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
          <Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="/xl/styles.xml" Id="rId3" /> 
          <Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme" Target="/xl/theme/theme.xml" Id="rId2" /> 
          <Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet" Target="/xl/worksheets/sheet.xml" Id="rId1" /> 
      </Relationships>

If I change the value of the Target attributes to use a relative reference path, giving the following output, then it does open on the iPad.

      <?xml version="1.0" encoding="utf-8" ?> 
      <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
          <Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="styles.xml" Id="rId3" /> 
          <Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme" Target="theme/theme.xml" Id="rId2" /> 
          <Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet" Target="worksheets/sheet.xml" Id="rId1" /> 
      </Relationships>

So the question is:
How do I change my .NET code so that it outputs the second version of the XML, with relative paths.

All help gratefully received!

like image 541
Jen Avatar asked Jun 07 '12 09:06

Jen


3 Answers

I've spent a lot of time researching this and thought I'd share my results. It appears that OpenXML is doing two things. 1. The content_types.xml file is missing an entry for the workbook 2. The xl/_rels/workbook.xml.rels file is using a fullly relative path.

Excel itself opens the file fine but I've tried various apps on the iPad and they all fail. So I've had to manually fix the files myself using the following code. It assumes the entire content of the file is passed in as a stream and uses DotNetZip to open and manipulate. Hope this code helps others!

    private Stream ApplyOpenXmlFix(Stream input)
    {
        const string RELS_FILE = @"xl/_rels/workbook.xml.rels";
        const string RELATIONSHIP_ELEMENT = "Relationship";
        const string CONTENT_TYPE_FILE = @"[Content_Types].xml";
        const string XL_WORKBOOK_XML = "/xl/workbook.xml";
        const string TARGET_ATTRIBUTE = "Target";
        const string SUPERFLUOUS_PATH = "/xl/";
        const string OVERRIDE_ELEMENT = "Override";
        const string PARTNAME_ATTRIBUTE = "PartName";
        const string CONTENTTYPE_ATTRIBUTE = "ContentType";
        const string CONTENTTYPE_VALUE = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml";

        XNamespace contentTypesNamespace = "http://schemas.openxmlformats.org/package/2006/content-types";
        XNamespace relsNamespace = "http://schemas.openxmlformats.org/package/2006/relationships";
        XDocument xlDocument;
        MemoryStream memWriter;

        try
        {
            input.Seek(0, SeekOrigin.Begin);
            ZipFile zip = ZipFile.Read(input);

            //First we fix the workbook relations file
            var workbookRelations = zip.Entries.Where(e => e.FileName == RELS_FILE).Single();
            xlDocument = XDocument.Load(workbookRelations.OpenReader());

            //Remove the /xl/ relative path from all target attributes
            foreach (var relationship in xlDocument.Root.Elements(relsNamespace + RELATIONSHIP_ELEMENT))
            {
                var target = relationship.Attribute(TARGET_ATTRIBUTE);

                if (target != null && target.Value.StartsWith(SUPERFLUOUS_PATH))
                {
                    target.Value = target.Value.Substring(SUPERFLUOUS_PATH.Length);
                }
            }

            //Replace the content in the source zip file
            memWriter = new MemoryStream();
            xlDocument.Save(memWriter, SaveOptions.DisableFormatting);
            memWriter.Seek(0, SeekOrigin.Begin);
            zip.UpdateEntry(RELS_FILE, memWriter);

            //Now we fix the content types XML file
            var contentTypeEntry = zip.Entries.Where(e => e.FileName == CONTENT_TYPE_FILE).Single();
            xlDocument = XDocument.Load(contentTypeEntry.OpenReader());

            if (!xlDocument.Root.Elements().Any(e =>
                e.Name == contentTypesNamespace + OVERRIDE_ELEMENT &&
                e.Attribute(PARTNAME_ATTRIBUTE) != null &&
                e.Attribute(PARTNAME_ATTRIBUTE).Value == XL_WORKBOOK_XML))
            {
                //Add in the missing element
                var overrideElement = new XElement(
                    contentTypesNamespace + OVERRIDE_ELEMENT,
                    new XAttribute(PARTNAME_ATTRIBUTE, XL_WORKBOOK_XML),
                    new XAttribute(CONTENTTYPE_ATTRIBUTE, CONTENTTYPE_VALUE));

                xlDocument.Root.Add(overrideElement);

                //Replace the content
                memWriter = new MemoryStream();
                xlDocument.Save(memWriter, SaveOptions.DisableFormatting);
                memWriter.Seek(0, SeekOrigin.Begin);
                zip.UpdateEntry(CONTENT_TYPE_FILE, memWriter);
            }

            Stream output = new MemoryStream();

            //Save file
            zip.Save(output);

            return output;
        }
        catch
        {
            //Just in case it fails, return the original document
            return input;
        }
    }
like image 194
DavidG Avatar answered Nov 04 '22 07:11

DavidG


Comradsky's answer of sending a pdf is a good idea, but in case anybody needs to be able to resolve this, I have come up with a solution. I know this is a horrible hack but it works and I've spent hours trying to find a way to do it "legally" to no avail.

It involves opening the .rels file and directly editing the xml within the file after the document has been closed.

    public static void MakeRelativePaths(string filepath)
    {
        // Get the namespace strings
        const string documentRelationshipType = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";
        const string relationshipSchema = "http://schemas.openxmlformats.org/package/2006/relationships";

        string documentUri = null;
        string documentDirectory = null;
        string documentName = null;

        Uri relDocUri = null;

        XName targetAttributeName = null;
        string targetValue = null;

        //  Open the package
        using (Package xlPackage = Package.Open(filepath, FileMode.Open, FileAccess.ReadWrite))
        {
            // Get the directory and filename of the main document part (e.g. /xl/workbook.xml).
            foreach (System.IO.Packaging.PackageRelationship relationship in xlPackage.GetRelationshipsByType(documentRelationshipType))
            {
                documentUri = relationship.TargetUri.ToString();

                documentName = System.IO.Path.GetFileName(documentUri);
                documentDirectory = documentUri.Substring(0, documentUri.Length - documentName.Length);

                //  There should only be document part in the package, but break out anyway.
                break;
            }

            // Load the relationship document
            relDocUri = new Uri(documentDirectory + "_rels/" + documentName + ".rels", UriKind.Relative);
            XDocument relDoc = XDocument.Load(xlPackage.GetPart(relDocUri).GetStream());

            // Loop through all of the relationship nodes
            targetAttributeName = XName.Get("Target");
            foreach (XElement relNode in relDoc.Elements(XName.Get("Relationships", relationshipSchema)).Elements(XName.Get("Relationship", relationshipSchema)))
            {
                // Edit the value of the Target attribute
                targetValue = relNode.Attribute(targetAttributeName).Value;

                if (targetValue.StartsWith(documentDirectory))
                    targetValue = targetValue.Substring(documentDirectory.Length);

                relNode.Attribute(targetAttributeName).Value = targetValue;
            }

            // Save the document
            relDoc.Save(xlPackage.GetPart(relDocUri).GetStream());
        }
    }
like image 22
Jen Avatar answered Nov 04 '22 07:11

Jen


I have also been struggling with a problem similar to this for a while. I finally came up with a solution that works. This is the code I wrote to fix the problem

        // Add a new worksheet part to the workbook.
        WorksheetPart newWorksheetPart = _document.WorkbookPart.AddNewPart<WorksheetPart>();
        newWorksheetPart.Worksheet = new DocumentFormat.OpenXml.Spreadsheet.Worksheet(new SheetData());

        Sheets sheets = _document.WorkbookPart.Workbook.GetFirstChild<Sheets>();
        string relationshipId = _document.WorkbookPart.GetIdOfPart(newWorksheetPart);

        //This bit is required for iPad to be able to read the sheets inside the xlsx file. The file will still work fine in Excel
        string relationshipType = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet";
        _document.Package.GetPart(_document.WorkbookPart.Uri).CreateRelationship(new Uri(newWorksheetPart.Uri.OriginalString.Replace("/xl/", String.Empty).Trim(), UriKind.Relative), TargetMode.Internal, relationshipType);
        _document.Package.GetPart(_document.WorkbookPart.Uri).DeleteRelationship(relationshipId);
        PackageRelationshipCollection sheetRelationships = _document.Package.GetPart(_document.WorkbookPart.Uri).GetRelationshipsByType(relationshipType);

        relationshipId = sheetRelationships.Where(f => f.TargetUri.OriginalString == newWorksheetPart.Uri.OriginalString.Replace("/xl/", String.Empty).Trim()).Single().Id;


        // Get a unique ID for the new sheet.
        uint sheetId = 1;
        if (sheets.Elements<Sheet>().Count() > 0)
            sheetId = sheets.Elements<Sheet>().Max(s => s.SheetId.Value) + 1;

        // Append the new worksheet and associate it with the workbook.
        Sheet sheet = new Sheet() { Id = relationshipId, SheetId = sheetId, Name = sheetName };
        sheets.Append(sheet);

        _worksheets.Add(new Worksheet(newWorksheetPart.Worksheet, sheetId));

_document and _worksheets, are private variables in my solution class.

like image 2
Burtie Arends Avatar answered Nov 04 '22 07:11

Burtie Arends