
Problematic corruption of .xlsx files with NPOI - Excel cannot open the file 'file.xlsx' because the file format or file extension is not valid

When reading or modifying some user-created .xlsx files, I get the following error message:

We found a problem with some content in 'test.xlsx'. Do you want us to try to recover as much as we can? If you trust the source of this workbook, click Yes.

Clicking Yes gets me another message:

Excel cannot open the file 'test.xlsx' because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file.

Example of a problem .xlsx file here (before being run through NPOI).

Here's the same file, now corrupted after being read and written back with IWorkbook.Write(filestream), here.

I have no issues creating a new .xlsx file with the following code:

string newPath = @"C:\MyPath\test.xlsx";

using (FileStream fs = new FileStream(newPath, FileMode.Create, FileAccess.Write))
{
    IWorkbook wb = new XSSFWorkbook();
    wb.CreateSheet();
    ISheet s = wb.GetSheetAt(0);
    IRow r = s.CreateRow(0);
    r.CreateCell(0);
    ICell c = r.GetCell(0);
    c.SetCellValue("test");
    wb.Write(fs);
    fs.Close();
}

That works fine.

Even opening one of the problem child .xlsx files, setting it to an IWorkbook and writing it back to the file works:

string newPath = @"C:\MyPath\test.xlsx";

using (FileStream fs = new FileStream(newPath, FileMode.Open, FileAccess.ReadWrite))
{
    IWorkbook wb = new XSSFWorkbook(fs);
    wb.Write(fs);
    fs.Close();
}

However, after running the file through code that reads from it (getting ISheets, IRows, ICells, etc.), the .xlsx file ends up corrupted, even though I specifically removed anything that modifies the workbook: no Creates, Sets, Styles, etc. with NPOI.

I can't really include my code because it would just be confusing, but for the sake of completeness I'm really only using the following types and functions from NPOI during this test:

IWorkbook
XSSFWorkbook
ISheet
IRow
ICell
.GetSheetAt
.GetRow
.GetCell
.LastRowNum

So one of those causes corruption. I would like to eventually set values again and get it working like I have for .xls.

Has anyone experienced this? What are some NPOI functions that could cause corruption? Any input would be appreciated.

Edit: Using NPOI v2.2.1.

justiceorjustus asked Dec 12 '16 16:12


3 Answers

I think the problem is that you are reading from, and writing to, the same FileStream. You should be doing the read and write using separate streams. Try it like this:

string newPath = @"C:\MyPath\test.xlsx";

// read the workbook
IWorkbook wb;
using (FileStream fs = new FileStream(newPath, FileMode.Open, FileAccess.Read))
{
    wb = new XSSFWorkbook(fs);
}

// make changes
ISheet s = wb.GetSheetAt(0);
IRow r = s.GetRow(0) ?? s.CreateRow(0);
ICell c = r.GetCell(1) ?? r.CreateCell(1);
c.SetCellValue("test2");

// overwrite the workbook using a new stream
using (FileStream fs = new FileStream(newPath, FileMode.Create, FileAccess.Write))
{
    wb.Write(fs);
}
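
To illustrate why reusing one stream for both reading and writing can damage a file, here is a minimal sketch that uses only System.IO and no NPOI. The class and method names (StreamReuseDemo, RewriteThroughSameStream, RewriteThroughNewStream) are hypothetical, and it is an assumption that NPOI leaves the stream positioned at the end of the file after reading. Under that assumption, a write through the same stream is appended after the old bytes instead of replacing them, while a fresh stream opened with FileMode.Create truncates the file first.

```csharp
using System;
using System.IO;

static class StreamReuseDemo
{
    // Reads the file to the end, then writes `replacement` through the SAME stream.
    // Returns the resulting file length.
    public static long RewriteThroughSameStream(string path, byte[] replacement)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            var scratch = new byte[4096];
            while (fs.Read(scratch, 0, scratch.Length) > 0) { } // position is now at end of file
            fs.Write(replacement, 0, replacement.Length);       // lands after the old content
        }
        return new FileInfo(path).Length;
    }

    // Writes `replacement` through a NEW stream; FileMode.Create truncates the old content.
    public static long RewriteThroughNewStream(string path, byte[] replacement)
    {
        using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            fs.Write(replacement, 0, replacement.Length);
        }
        return new FileInfo(path).Length;
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        byte[] replacement = { 1, 2, 3, 4 };

        File.WriteAllBytes(path, new byte[10]); // stand-in for the original workbook
        Console.WriteLine(RewriteThroughSameStream(path, replacement)); // 14: old 10 bytes + 4 new

        File.WriteAllBytes(path, new byte[10]);
        Console.WriteLine(RewriteThroughNewStream(path, replacement));  // 4: only the new bytes

        File.Delete(path);
    }
}
```

An .xlsx file is a zip archive, so leftover or misplaced bytes like these are enough for Excel to report an invalid file format.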

Brian Rogers answered Oct 19 '22 06:10


I had the same problem. In my case the problem was not with NPOI itself but with its dependency, SharpZipLib.

I used NPOI 2.3.0 with SharpZipLib 1.0.0 and got the same error as in your case. The generated Excel file was 0 bytes in size. I downgraded SharpZipLib to 0.86.0 in the project where I was using NPOI (a service layer) and also in the MVC project (which referenced the SharpZipLib package too).

I also manually removed from web.config the assembly binding redirect that had previously been created for SharpZipLib:

<assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
  .......
  <dependentAssembly>
    <assemblyIdentity name="ICSharpCode.SharpZipLib" publicKeyToken="1b03e6acf1164f73" culture="neutral" />
    <bindingRedirect oldVersion="0.0.0.0-1.0.0.999" newVersion="1.0.0.999" />
  </dependentAssembly>
</assemblyBinding>

I hope this helps someone.

IonutC answered Oct 19 '22 07:10


I had the same error when writing the Excel file to a memory stream and then downloading it through my .NET Core controller.

This code was my problem (at this point, workbook contained the NPOI Excel file I created):

var fileName = $"export.xlsx";
var mimeType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
MemoryStream stream = new();
workbook.Write(stream);
byte[] output = stream.GetBuffer();
return File(output, mimeType, fileName);

The issue was this line:

byte[] output = stream.GetBuffer();

That line gave me a byte array that contained the contents of my Excel file, but I did not realize that GetBuffer returns the stream's entire underlying buffer: the bytes of the Excel file followed by whatever unused capacity had been allocated beyond them. The downloaded file therefore had extra bytes after the end of the real workbook.

I replaced that line with this:

byte[] output = stream.ToArray();

and life was good.
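
The difference between the two calls can be seen without NPOI. The following is a small sketch using only System.IO; the class and method names (GetBufferDemo, Measure) are hypothetical and exist only for illustration. It writes a short payload to a MemoryStream and compares the array sizes returned by GetBuffer and ToArray.

```csharp
using System;
using System.IO;

static class GetBufferDemo
{
    // Writes `payload` to a fresh MemoryStream and reports the number of bytes
    // written, the length of GetBuffer(), and the length of ToArray().
    public static (long Written, int BufferLength, int ArrayLength) Measure(byte[] payload)
    {
        using (var stream = new MemoryStream())
        {
            stream.Write(payload, 0, payload.Length);

            byte[] buffer = stream.GetBuffer(); // whole internal buffer, including unused capacity
            byte[] exact = stream.ToArray();    // copy of exactly the bytes that were written

            return (stream.Length, buffer.Length, exact.Length);
        }
    }

    static void Main()
    {
        var result = Measure(new byte[] { 1, 2, 3, 4, 5 });
        Console.WriteLine(
            $"written={result.Written}, GetBuffer={result.BufferLength}, ToArray={result.ArrayLength}");
    }
}
```

ToArray always matches the number of bytes written, while GetBuffer is at least that long and is typically longer because the stream grows its buffer in larger steps. If the raw buffer must be used, only its first stream.Length bytes are valid.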

birwin answered Oct 19 '22 07:10