I am attempting to write out an Excel file from an ASP.NET web server using OpenXML. I have about 2100 records and its taking around 20-30 seconds to do this. Any way I can make it faster? Retrieving the 2100 rows from the db takes a fraction of a second. Not sure why manipulating them in memory would take any longer.
Note: ExcelWriter is our custom class, but all its methods are directly from code in this link, http://msdn.microsoft.com/en-us/library/cc861607.aspx
public static MemoryStream CreateThingReport(List<Thing> things, MemoryStream template)
{
SpreadsheetDocument spreadsheet = SpreadsheetDocument.Open(template, true);
WorksheetPart workSheetPart = spreadsheet.WorkbookPart.WorksheetParts.First();
SharedStringTablePart sharedStringPart = spreadsheet.WorkbookPart.GetPartsOfType<SharedStringTablePart>().First();
Cell cell = null;
int index = 0;
//create cell formatting for header text
Alignment wrappedAlignment = new Alignment { WrapText = true };
uint rowOffset = 2;
foreach (Thing t in things)
{
//Received Date
cell = ExcelWriter.InsertCellIntoWorksheet("A", rowOffset, workSheetPart);
index = ExcelWriter.InsertSharedStringItem(t.CreateDate.ToShortDateString(), sharedStringPart);
cell.CellValue = new CellValue(index.ToString());
cell.DataType = new DocumentFormat.OpenXml.EnumValue<CellValues>(CellValues.SharedString);
//Car Part Name
cell = ExcelWriter.InsertCellIntoWorksheet("B", rowOffset, workSheetPart);
index = ExcelWriter.InsertSharedStringItem(t.CarPart.Name, sharedStringPart);
cell.CellValue = new CellValue(index.ToString());
cell.DataType = new DocumentFormat.OpenXml.EnumValue<CellValues>(CellValues.SharedString);
rowOffset++;
}
workSheetPart.Worksheet.Save();
spreadsheet.WorkbookPart.Workbook.Save();
spreadsheet.Close();
return template;
The big improment is more Save() function out of loop
//Save data
shareStringPart.SharedStringTable.Save();
worksheetPart.Worksheet.Save();
For 500 records, for me it change from 10 mins to 1 min.
So it looks like someone in the MSDN community docs ran into similar performance implications. The code below is very inefficient. Someone recommended using a hash table.
For our solution we just removed the insertion of shared strings altogether and went from 1:03 seconds to 0:03 seconds in download time.
//Old: (1:03)
cell = ExcelWriter.InsertCellIntoWorksheet("A", rowOffset, workSheetPart);
index = ExcelWriter.InsertSharedStringItem(thing.CreateDate.ToShortDateString(), sharedStringPart);
cell.CellValue = new CellValue(index.ToString());
cell.DataType = new DocumentFormat.OpenXml.EnumValue<CellValues>(CellValues.SharedString);
//New: (0:03)
cell = ExcelWriter.InsertCellIntoWorksheet("A", rowOffset, workSheetPart);
cell.CellValue = new CellValue(thing.CreateDate.ToShortDateString());
cell.DataType = new DocumentFormat.OpenXml.EnumValue<CellValues>(CellValues.String);
MSDN Docs (slow solution, they should use a Hash Table instead)
private static int InsertSharedStringItem(string text, SharedStringTablePart shareStringPart)
{
// If the part does not contain a SharedStringTable, create one.
if (shareStringPart.SharedStringTable == null)
{
shareStringPart.SharedStringTable = new SharedStringTable();
}
int i = 0;
// Iterate through all the items in the SharedStringTable. If the text already exists, return its index.
foreach (SharedStringItem item in shareStringPart.SharedStringTable.Elements<SharedStringItem>())
{
if (item.InnerText == text)
{
return i;
}
i++;
}
// The text does not exist in the part. Create the SharedStringItem and return its index.
shareStringPart.SharedStringTable.AppendChild(new SharedStringItem(new DocumentFormat.OpenXml.Spreadsheet.Text(text)));
shareStringPart.SharedStringTable.Save();
return i;
}
@The Internet
Note that String data type is actually for formulas, for text should use InlineString. See 17.18.11 ST_CellType (Cell Type):
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With