Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

OpenXML SDK 2.5 unreadable content

Tags:

c#

openxml-sdk

I am working on taking an existing Excel File which already has all of its formulas and formatting, I add data to the a sheet with a Table and when I then open that file in Excel I get the error

"Excel completed file level validation and repair. Some parts of this workbook may have been repaired or discarded. Removed Records: Cell information from /xl/worksheets/sheet6.xml part"

I then open a manually created file with the same content and it works just fine. I also discovered the Open XML 2.5 Productivity Tool which when I run validation on the generated file it says no issues found.

When I run a compare on the two files I see the Generated file looks like this.

        <x:c r="B462" t="inlineStr">
            <x:is>
                <x:t>1150828</x:t>
            </x:is>
        </x:c>

While the Manually created file has Cells that look like this.

       <c s="80" r="B462">
         <v>
           1150828
         </v>
       </c>

Obviously there is a difference here but I don't know how to correct it nor do I know if this difference is the actual cause of the error. but seeing as how everything else seems to look the same I don't know what else it could be.

Oh and couple more things this file is not working but I am able to use another file that does not contain a Table, when I incorporate a table the issue occurs so I at least know that much.

Also if you're going to suggest that I use ClosedXML please don't. I have used it and it tends to leave off formatting at random for some reason that I cannot figure out hence why I have moved to OpenXML SDk

Here is some of the C# Code

 dt.Load(reader);
                            RowCount = dt.Rows.Count;
                            ColumnCount = dt.Columns.Count;

                                workbookPart = spreadDoc.WorkbookPart;
                                SheetDimension sheetDimension = new SheetDimension() { Reference = "A1:" + ColumnLetters[ColumnCount - 1] + (RowCount + 1) };

                                worksheetPart = Program.GetWorksheetPart(workbookPart, reportStep.ExcelSheetName);
                                worksheetPart.Worksheet.SheetDimension = sheetDimension;

                                SheetData sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();
                                string relId = workbookPart.Workbook.Descendants<Sheet>().First(s => reportStep.ExcelSheetName.Equals(s.Name)).Id;

                                if (reportStep.ExcelTableExists)
                                {
                                    TableDefinitionPart tableDef = null;

                                    int looper = 0;
                                    foreach (WorksheetPart wsp in spreadDoc.WorkbookPart.WorksheetParts)
                                    {
                                        if (wsp.TableDefinitionParts.Where(tbl => tbl.Table.DisplayName.Value.Equals(reportStep.ExcelTableName)).Count() == 1)
                                        {
                                            tableDef = spreadDoc.WorkbookPart.WorksheetParts.ElementAt(looper).TableDefinitionParts.Where(tbl => tbl.Table.DisplayName.Value.Equals(reportStep.ExcelTableName)).FirstOrDefault();
                                            tableDef.Table.Reference.Value = "A1:" + (ColumnLetters[ColumnCount - 1] + (RowCount +1) ).ToString();
                                            tableDef.Table.AutoFilter.Reference.Value = "A1:" + (ColumnLetters[ColumnCount - 1] + (RowCount +1)).ToString();
                                           // tabledefinitionPart = Program.GetTablePart(wsp, reportStep.ExcelTableName, ColumnCount, RowCount);
                                        }
                                        looper++;
                                    }


                                }

                                sheetData = Chef.Program.ExportDataTable(dt, sheetData);
                                Sheet sheet = workbookPart.Workbook.Descendants<Sheet>().FirstOrDefault(s => s.Name == reportStep.ExcelSheetName);









public static TableDefinitionPart GetTablePart(WorksheetPart worksheet, string tablename, int columnCount, int rowCount)
    {
        uint CellRange = (uint)(columnCount);
        TableColumns tableColumns1 = new TableColumns() { Count = (UInt32Value)(CellRange) };

        var tableDefPart = worksheet.TableDefinitionParts.Where(tbl => tbl.Table.DisplayName.Value.Equals(tablename)).FirstOrDefault();
            //worksheet.WorksheetPart.TableDefinitionParts.AddNewPart<TableDefinitionPart>(tablename);
        var table = new Table() { HeaderRowCount = (uint)columnCount, Name = tablename, DisplayName = tablename, Reference = "A1:" + ColumnLetters[columnCount -1] + (rowCount + 1), TotalsRowShown = false };
        TableStyleInfo tableStyleInfo1 = new TableStyleInfo()
        {
            Name = "TableStyleMedium2",
            ShowFirstColumn = false,
            ShowLastColumn = false,
            ShowRowStripes = true,
            ShowColumnStripes = false
        };

        table.Append(tableStyleInfo1);
       // table.Append(tableColumns1);
        tableDefPart.Table = table;
        return tableDefPart;
    }

EDIT SECTION ADDING IN THE ADDITIONAL METHODS REQUESTED Updated 9/5/15

I did remove the code that added the header values since they are already a part of the base template of the excel file. also removed the specifying of the cell datatype to preserve what the template already had the cell datatype set to.

       public static SheetData ExportDataTable2(System.Data.DataTable exportData, SheetData sheetData)
    {
        //loop through each data row  
        DataRow contentRow;
        int startRow = 2;
        for (int i = 0; i < exportData.Rows.Count; i++)
        {
            contentRow = exportData.Rows[i];
            sheetData.AppendChild(createContentRow(contentRow, i + startRow));
        }

        return sheetData;
    }


    private static Cell createTextCell(int columnIndex, int rowIndex, object cellValue)
    {
        Cell cell = new Cell();

       // cell.DataType = CellValues.Number;
        cell.CellReference = getColumnName(columnIndex) + rowIndex;
        cell.CellValue = new CellValue(cellValue.ToString());

        return cell;
    }

    private static Row createContentRow(DataRow dataRow, int rowIndex)
    {

        Row row = new Row
        {
            RowIndex = (UInt32)rowIndex
        };

        for (int i = 0; i < dataRow.Table.Columns.Count; i++)
        {
            Cell dataCell = createTextCell(i + 1, rowIndex, dataRow[i]);
          //  dataCell.DataType = CellValues.SharedString;
            row.AppendChild(dataCell);
        }

        return row;
    }
like image 246
scripter78 Avatar asked Aug 28 '15 18:08

scripter78


1 Answers

Well, it appears that you have used the following example OpenXML SDK 2.0: Export a DataTable to Excel as a base for your code. Here is the original code for creating a cell:

private Cell createTextCell(int columnIndex, int rowIndex, object cellValue)
{
    Cell cell = new Cell();

    cell.DataType = CellValues.InlineString;
    cell.CellReference = getColumnName(columnIndex) + rowIndex;
    InlineString inlineString = new InlineString();
    Text t = new Text();

    t.Text = cellValue.ToString();
    inlineString.AppendChild(t);
    cell.AppendChild(inlineString);

    return cell;
}

Your original code was exactly the same except the following line:

cell.DataType = CellValues.String;

See the difference?

Then you have changed it to:

private static Cell createTextCell(int columnIndex, int rowIndex, object cellValue)
    {
        Cell cell = new Cell();

       // cell.DataType = CellValues.Number;
        cell.CellReference = getColumnName(columnIndex) + rowIndex;
        cell.CellValue = new CellValue(cellValue.ToString());

        return cell;
    }

Ok, the problem is that you don't set cell.DataType correctly. It needs to be in sync with the cell content, otherwise you'll get such an errors from Excel. In the former case you set content to inline string, but data type to String. In the later - data type to Number (it doesn't matter that you have commented the line - Number is the default data type for cells) but the content is not always a number (the same function is used for the column headears - after all, it's called createTextCell).

In order to fix the problem, either use the original code from the example, or this code:

private static Cell createTextCell(int columnIndex, int rowIndex, object cellValue)
{
    Cell cell = new Cell();
    cell.DataType = CellValues.String;
    cell.CellReference = getColumnName(columnIndex) + rowIndex;
    cell.CellValue = new CellValue(cellValue.ToString());
    return cell;
}

Finally, if you need to store a shared string, number, date etc., read the documentation and set the appropriate properties. I would aggree that the OpenXml API is not very intuitive, but this is what we have.

EDIT: Based on your comments, seems like your real problem is not exactly the one in question. Here is a high performance example of exporting DataTable with different data type columns:

public static class ExcelExporter
{
    public static void ExportDataTable(DataTable table, SheetData data)
    {
        var cellFactory = new CellFactory[table.Columns.Count];
        for (int i = 0; i < table.Columns.Count; i++)
            cellFactory[i] = GetCellFactory(table.Columns[i].DataType);
        int rowIndex = 0;
        data.AppendChild(CreateHeaderRow(rowIndex++, table));
        for (int i = 0; i < table.Rows.Count; i++)
            data.AppendChild(CreateContentRow(rowIndex++, table.Rows[i], cellFactory));
    }
    private static Row CreateHeaderRow(int rowIndex, DataTable table)
    {
        var row = CreateRow(rowIndex);
        for (int i = 0; i < table.Columns.Count; i++)
        {
            var cell = CreateTextCell(i, rowIndex, table.Columns[i].ColumnName);
            row.AppendChild(cell);
        }
        return row;
    }
    private static Row CreateContentRow(int rowIndex, DataRow dataRow, CellFactory[] cellFactory)
    {
        var row = CreateRow(rowIndex);
        for (int i = 0; i < dataRow.Table.Columns.Count; i++)
        {
            var cell = cellFactory[i](i, rowIndex, dataRow[i]);
            row.AppendChild(cell);
        }
        return row;
    }
    private static Row CreateRow(int index) { return new Row { RowIndex = (uint)index + 1 }; }
    private delegate Cell CellFactory(int columnIndex, int rowIndex, object cellValue);
    private static CellFactory GetCellFactory(Type dataType)
    {
        CellFactory factory;
        return CellFactoryMap.TryGetValue(dataType, out factory) ? factory : TextCellFactory;
    }
    private static readonly CellFactory TextCellFactory = CreateTextCell;
    private static readonly CellFactory DateCellFactory = CreateDateCell;
    private static readonly CellFactory NumericCellFactory = CreateNumericCell;
    private static readonly CellFactory BooleanCellFactory = CreateBooleanCell;
    private static readonly Dictionary<Type, CellFactory> CellFactoryMap = new Dictionary<Type, CellFactory>
    {
        { typeof(bool), BooleanCellFactory },
        { typeof(DateTime), DateCellFactory },
        { typeof(byte), NumericCellFactory },
        { typeof(sbyte), NumericCellFactory },
        { typeof(short), NumericCellFactory },
        { typeof(ushort), NumericCellFactory },
        { typeof(int), NumericCellFactory },
        { typeof(uint), NumericCellFactory },
        { typeof(long), NumericCellFactory },
        { typeof(ulong), NumericCellFactory },
        { typeof(float), NumericCellFactory },
        { typeof(double), NumericCellFactory },
        { typeof(decimal), NumericCellFactory },
    };
    private static Cell CreateTextCell(int columnIndex, int rowIndex, object cellValue)
    {
        return CreateCell(CellValues.String, columnIndex, rowIndex, ToExcelValue(cellValue));
    }
    private static Cell CreateDateCell(int columnIndex, int rowIndex, object cellValue)
    {
        // NOTE: CellValues.Date is not supported in older Excel version.
        // In all Excel versions dates can be stored with CellValues.Number and a format style.
        // Since I have no styles, will export them just as text
        //var cell = CreateCell(CellValues.Number, columnIndex, rowIndex, ToExcelDate(cellValue));
        //cell.StyleIndex = ...;
        //return cell;
        return CreateCell(CellValues.String, columnIndex, rowIndex, 
            cellValue != null && cellValue != DBNull.Value ? ((DateTime)cellValue).ToShortDateString() : null);
    }
    private static Cell CreateNumericCell(int columnIndex, int rowIndex, object cellValue)
    {
        return CreateCell(CellValues.Number, columnIndex, rowIndex, ToExcelValue(cellValue));
    }
    private static Cell CreateBooleanCell(int columnIndex, int rowIndex, object cellValue)
    {
        // NOTE: CellValues.Boolean is not supported in older Excel version
        //return CreateCell(CellValues.Boolean, columnIndex, rowIndex, ToExcelValue(cellValue));
        return CreateCell(CellValues.String, columnIndex, rowIndex, ToExcelValue(cellValue));
    }
    private static Cell CreateCell(CellValues dataType, int columnIndex, int rowIndex, string cellValue)
    {
        var cell = new Cell();
        if (dataType != CellValues.Number) cell.DataType = dataType;
        cell.CellReference = GetColumnName(columnIndex) + (rowIndex + 1);
        cell.CellValue = new CellValue(cellValue ?? string.Empty);
        return cell;
    }
    private static string ToExcelValue(object value)
    {
        if (value == null || value == DBNull.Value) return null;
        return Convert.ToString(value, CultureInfo.InvariantCulture);
    }
    private static DateTime ExcelBaseDate = new DateTime(1900, 1, 1);
    private static string ToExcelDate(object value)
    {
        const int days29Feb1900 = 59;
        if (value == null || value == DBNull.Value) return null;
        var date = ((DateTime)value).Date;
        var days = (date - ExcelBaseDate).Days + 1;
        if (days >= days29Feb1900) days++;
        return days.ToString(CultureInfo.InvariantCulture);
    }
    private static string GetColumnName(int index) { return ColumnNameTable[index]; }
    private static readonly string[] ColumnNameTable = BuildColumnNameTable();
    private static string[] BuildColumnNameTable()
    {
        var table = new string[16384];
        var sb = new StringBuilder();
        for (int i = 0; i < table.Length; i++)
            table[i] = sb.BuildColumnName(i);
        return table;
    }
    private static string BuildColumnName(this StringBuilder sb, int index)
    {
        const int startLetter = 'A';
        const int letterCount = 'Z' - startLetter + 1;
        sb.Clear();
        while (true)
        {
            var letter = (char)(startLetter + (index % letterCount));
            sb.Insert(0, letter);
            if (index < letterCount) break;
            index = (index / letterCount) - 1;
        }
        return sb.ToString();
    }
}

The key point is, instead of checking the type of every value during the processing, prepare at the beginning a different create cell method for each column based on it's data type.

like image 95
Ivan Stoev Avatar answered Nov 15 '22 16:11

Ivan Stoev