
Huge memory Allocation when using EPPlus Excel Library

Tags:

c#

excel

epplus

Context

I have been using EPPlus to automate Excel report generation, with C# as the client language.

Problem:

When I try to write a really big report (the result of a SQL query) with pivot tables, charts and so forth, I end up with an OutOfMemoryException.

Troubleshooting

To troubleshoot, I decided to open an existing 138 MB report and use the GC class to take a peek at what is happening with my memory. Here are the results.

// Requires: using System.IO; using OfficeOpenXml;
ExcelPackage pkg = new ExcelPackage(new FileInfo(@"PATH TO THE REPORT.xlsx"));
ExcelWorkbook wb = pkg.Workbook;

Garbage collector results before and after the second line of code:

The amount of memory in use is too damn high.

So, I have no idea what to do from here. All I am doing is opening the report, and that alone consumes roughly 10 (9.98, actually) times the report's size in memory.

The ~138 MB Excel file takes up 1,370,817,264 bytes of RAM.
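For anyone who wants to reproduce the before/after measurement described above, here is a minimal sketch. `MemoryProbe` and `MeasureRetained` are hypothetical names introduced for illustration, and the 50 MB buffer is only a stand-in for the EPPlus call, so the example runs without the library. It assumes C# 7 or later. Note that `GC.GetTotalMemory` reports the managed heap only, so unmanaged allocations would not show up.

```csharp
using System;

static class MemoryProbe
{
    // Returns the change in managed heap size (in bytes) across the call to load().
    // The loaded object is handed back through 'result' so that it is still
    // reachable when the second measurement is taken.
    public static long MeasureRetained<T>(Func<T> load, out T result)
    {
        long before = GC.GetTotalMemory(forceFullCollection: true);
        result = load();
        long after = GC.GetTotalMemory(forceFullCollection: true);
        return after - before;
    }

    static void Main()
    {
        // Stand-in workload (assumption): a 50 MB byte array.
        // With EPPlus, the delegate would be something like () => pkg.Workbook.
        long delta = MeasureRetained(() => new byte[50 * 1024 * 1024], out byte[] buffer);
        Console.WriteLine($"Retained: {delta:N0} bytes ({delta / (1024.0 * 1024.0):F1} MB)");
        GC.KeepAlive(buffer); // keep the buffer alive until after the measurement is printed
    }
}
```

Forcing a full collection before each reading makes the two numbers comparable, at the cost of a pause that would not be acceptable in production code.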

Update One:

There's a fairly recent beta version of EPPlus whose changelog includes:

New Cell store
* Less memory consumption
* Insert columns (not on the range level)
* Faster row inserts

After updating the NuGet package, I still get the same exception, but it is now thrown at the first line rather than the second.

Marcello Grechi Lins asked Jul 03 '14 19:07


2 Answers

Modern Excel files (i.e., .xlsx files) are zip-compressed, and often compress down to about 10% of their original size. I just uncompressed a 1.6 MB file I generated using a similar tool and found it extracted to 18.8 MB of data.

You've got a 0.138 GB file that is using 1.370 GB of memory, so the file on disk is almost exactly 10% of its in-memory size. The uncompressed representation in memory is what is eating your memory.

If you're curious, you can use a tool like 7-Zip to extract the .xlsx file, or you can rename the file to end in .zip and browse it in Windows.
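If you would rather check the ratio from code than extract the archive by hand, a rough sketch using the standard `System.IO.Compression` types is below. `XlsxSizeReport` and `Measure` are hypothetical names for this example, and it assumes C# 7 or later. On .NET Framework, a reference to the `System.IO.Compression.FileSystem` assembly is needed for `ZipFile`.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;

static class XlsxSizeReport
{
    // Sums the compressed and uncompressed sizes of every entry in a zip container.
    // An .xlsx file is such a container, so it can be opened directly.
    public static (long Compressed, long Uncompressed) Measure(string path)
    {
        using (ZipArchive archive = ZipFile.OpenRead(path))
        {
            long compressed = archive.Entries.Sum(e => e.CompressedLength);
            long uncompressed = archive.Entries.Sum(e => e.Length);
            return (compressed, uncompressed);
        }
    }

    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: XlsxSizeReport <file.xlsx>");
            return;
        }

        var (compressed, uncompressed) = Measure(args[0]);
        // Guard against an empty archive to avoid dividing by zero.
        double ratio = uncompressed == 0 ? 0 : (double)compressed / uncompressed;

        Console.WriteLine($"Compressed:   {compressed:N0} bytes");
        Console.WriteLine($"Uncompressed: {uncompressed:N0} bytes");
        Console.WriteLine($"Compressed size is {ratio:P1} of uncompressed");
    }
}
```

This only reports the size of the raw XML inside the package. The object model a library builds from that XML can be larger or smaller, so treat the number as an estimate rather than a prediction of memory use.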

antiduh answered Sep 19 '22 21:09


I ran into this too and found no real solution, so I had to come up with one myself. It comes as a new library: https://github.com/danielgindi/SpreadsheetStreams.net

It is based on a very old piece of code of mine that supported CSV and XML. I refactored the interface, added xlsx support, and published it as a standalone library.

This is not a replacement for EPPlus or other spreadsheet manipulation libraries; it is only about streaming generation of reports. Not all Excel features are supported, either.
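To illustrate what "streaming generation" means in general, here is a minimal sketch that writes CSV one row at a time using only the base class library. It is not the SpreadsheetStreams API. `StreamingCsv`, `Write`, and `FakeRows` are hypothetical names, and `FakeRows` stands in for a forward-only data reader over the SQL result.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class StreamingCsv
{
    // Quotes a field only if it contains a comma, a quote, or a line break;
    // embedded quotes are doubled.
    static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0) return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Writes each row as soon as it is produced. Only the current row is held
    // in memory, so memory use does not grow with the size of the report.
    public static void Write(TextWriter output, IEnumerable<string[]> rows)
    {
        foreach (string[] row in rows)
        {
            output.WriteLine(string.Join(",", row.Select(Escape)));
        }
    }

    // Hypothetical data source: lazily yields rows instead of building a list.
    static IEnumerable<string[]> FakeRows(int count)
    {
        for (int i = 0; i < count; i++)
        {
            yield return new[] { i.ToString(), "item " + i, "a,b" };
        }
    }

    static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "report.csv");
        using (var writer = new StreamWriter(path))
        {
            Write(writer, FakeRows(1000000));
        }
        Console.WriteLine("Wrote " + path);
    }
}
```

The trade-off is the one described above: a writer like this cannot go back and edit earlier cells, which is why features such as pivot tables are hard to offer in a streaming design.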

daniel.gindi answered Sep 20 '22 21:09