Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

OpenXML SDK returning a number for CellValue instead of cells text

I am using the Open XML SDK to open an Excel xlsx file and I try to read the cellvalue on position A1 in each sheet. I use the following code:

using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(openFileDialog1.FileName, false)) {     var sheets = spreadsheetDocument.WorkbookPart.Workbook.Descendants<Sheet>();      foreach (Sheet sheet in sheets)     {         WorksheetPart worksheetPart = (WorksheetPart)spreadsheetDocument.WorkbookPart.GetPartById(sheet.Id);         Worksheet worksheet = worksheetPart.Worksheet;          Cell cell = GetCell(worksheet, "A", 1);          Console.Writeline(cell.CellValue.Text);      } }  private static Cell GetCell(Worksheet worksheet, string columnName, uint rowIndex) {      Row row = GetRow(worksheet, rowIndex);       if (row == null)          return null;       return row.Elements<Cell>().Where(c => string.Compare                (c.CellReference.Value, columnName +                rowIndex, true) == 0).First(); }  // Given a worksheet and a row index, return the row. private static Row GetRow(Worksheet worksheet, uint rowIndex) {     return worksheet.GetFirstChild<SheetData>().           Elements<Row>().Where(r => r.RowIndex == rowIndex).First(); }  

The text in the first worksheet on position A1 is simply 'test' however, in my console I see the value '0' as cell.CellValue.Text

Does anyone have an idea to get the correct value of the cell?

like image 710
jwdehaan Avatar asked Feb 25 '11 08:02

jwdehaan


People also ask

What is DocumentFormat OpenXml?

The Open XML SDK provides tools for working with Office Word, Excel, and PowerPoint documents. It supports scenarios such as: - High-performance generation of word-processing documents, spreadsheets, and presentations. - Populating content in Word files from an XML data source.

What is the difference between OpenXml and ClosedXML?

ClosedXML is an open source library created based on OpenXmlSdk. It's more user friendly. ClosedXML is a . NET library for reading, manipulating and writing Excel 2007+ (.

What is Sharedstring in OpenXml?

The shared strings table is a separate part inside the package. Each workbook contains only one shared string table part that contains strings that can appear multiple times in one sheet or in multiple sheets.


2 Answers

All strings in an Excel worksheet are stored in a array like structure called the SharedStringTable. The goal of this table is to centralize all strings in an index based array and then if that string is used multiple times in the document to just reference the index in this array. That being said, the 0 you received when you got the text value of the A1 cell is the index into the SharedStringTable. To get the real value you can use this helper function:

public static SharedStringItem GetSharedStringItemById(WorkbookPart workbookPart, int id) {     return workbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(id); } 

Then in your code call it like this to get the real value:

Cell cell = GetCell(worksheet, "A", 1);  string cellValue = string.Empty;  if (cell.DataType != null) {     if (cell.DataType == CellValues.SharedString)     {        int id = -1;         if (Int32.TryParse(cell.InnerText, out id))        {            SharedStringItem item = GetSharedStringItemById(workbookPart, id);             if (item.Text != null)            {                cellValue = item.Text.Text;            }            else if (item.InnerText != null)            {                cellValue = item.InnerText;            }            else if (item.InnerXml != null)            {                cellValue = item.InnerXml;            }        }     } } 
like image 125
amurra Avatar answered Sep 28 '22 04:09

amurra


Amurra's answer seems to go ninety percent of the way, but it may need some nuance.

1) The function "GetSharedStringItemById" returns a SharedStringItem, not a string, such that the calling code example will not work. To get the actual value as a string, I believe you need to ask for the SharedStringItem's InnerText property, as follows:

public static string GetSharedStringItemById(WorkbookPart workbookPart, int id) {     return workbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(id).InnerText; } 

2) The function also (correctly) asks for an int as part of its signature, but the example code call supplies a string, cell.CellValue.Text. It's trivial to convert the string to an int, but needs to be done, as the code as written might to be confusing.

like image 35
Brent Avatar answered Sep 28 '22 03:09

Brent