Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Reading large XLSX files

I have an application that have to read excel and convert it to array. So far so good. Everything works file until I try to convert a larger file. I try OpenXML and try SAX approach:

using (SpreadsheetDocument xlsx = SpreadsheetDocument.Open(filePath, false))
{
   WorkbookPart workbookPart = xlsx.WorkbookPart;
   List<List<string>> parsedContent = new List<List<string>>();
   foreach (WorksheetPart worksheet in workbookPart.WorksheetParts)
       {
           OpenXmlReader xlsxReader = OpenXmlReader.Create(worksheet);

           while (xlsxReader.Read())
           {
           }
        }
 }

This is working well for files in range 1 - 10MB. My problem is when I try to load 10+ MB file. The result is OutOfMemoryException. How to proper read that big chunk of data? How to do it memory efficient?

P.s. I try libraries like ClosedXML, EPPlus and few others.

Every solution will be appreciated. Thank you in advance

like image 628
Vasil Indzhev Avatar asked Feb 06 '23 16:02

Vasil Indzhev


1 Answers

If you plan on only performing a read on the excel file content, I suggest you use the ExcelDataReader library instead Link, which extracts the worksheetData into a DataSet object.

        IExcelDataReader reader = null;
        string FilePath = "PathToExcelFile";

        //Load file into a stream
        FileStream stream = File.Open(FilePath, FileMode.Open, FileAccess.Read);

        //Must check file extension to adjust the reader to the excel file type
        if (Path.GetExtension(FilePath).Equals(".xls"))
            reader = ExcelReaderFactory.CreateBinaryReader(stream);
        else if (Path.GetExtension(FilePath).Equals(".xlsx"))
            reader = ExcelReaderFactory.CreateOpenXmlReader(stream);

        if (reader != null)
        {
            //Fill DataSet
            DataSet content = reader.AsDataSet();
            //Read....
        }
like image 151
Innat3 Avatar answered Feb 08 '23 15:02

Innat3