
org.apache.poi.POIXMLException Strict OOXML isn't currently supported, please see bug #57699

I'd like to parse an Excel file with Java, so I'm using the Apache POI libraries. Here is the Maven dependency:

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.14</version>
</dependency>

This will include a series of dependencies:

poi-ooxml-3.14.jar
poi-3.14.jar
commons-codec-1.10.jar
poi-ooxml-schemas-3.14.jar
xmlbeans-2.6.0.jar
stax-api-1.0.1.jar
curvesapi-1.03.jar

When I try to read an Office 365 Excel file (.xlsx) with this code:

import java.io.File;
import java.io.FileInputStream;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ExcelConverter {

    public static void main(String[] args) throws Exception {
        String excelFilePath = "C:/temp/Book1.xlsx";
        File myFile = new File(excelFilePath);
        System.out.println("File exists: " + myFile.exists());

        // try-with-resources closes the stream even when the constructor throws
        try (FileInputStream inputStream = new FileInputStream(myFile)) {
            // This constructor call is where the POIXMLException is raised
            Workbook workbook = new XSSFWorkbook(inputStream);
        }
    }
}

I got the following console message:

File exists: true
Exception in thread "main" org.apache.poi.POIXMLException: Strict OOXML isn't currently supported, please see bug #57699
    at org.apache.poi.POIXMLDocumentPart.getPartFromOPCPackage(POIXMLDocumentPart.java:679)
    at org.apache.poi.POIXMLDocumentPart.<init>(POIXMLDocumentPart.java:122)
    at org.apache.poi.POIXMLDocumentPart.<init>(POIXMLDocumentPart.java:115)
    at org.apache.poi.POIXMLDocument.<init>(POIXMLDocument.java:61)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:273)
    at org.myCompany.excel.ExcelConverter.main(ExcelConverter.java:25)

Do you know what I can do to solve the issue? Thanks in advance.

user1820620 asked Jun 10 '16 13:06



1 Answer

There doesn't currently appear to be any simple solution other than "Don't save your spreadsheet in 'strict OOXML' format."

For example, in Excel use

Save As --> "Excel Workbook (.xlsx)" 

instead of

Save As --> "Strict Open XML Spreadsheet (.xlsx)" 

Do you know why Excel Worksheet and this format have the same file extension?

That would be something that only Microsoft can answer. But I guess that the engineers (or their management) did not anticipate that it would be necessary for application software to make the distinction.

I am accepting Files as input and then processing them based on the extension. How can I know without try-catch?

Nothing in the current generation of POI will let you process the document.

I guess you could write code that reads the file and looks for the "strict OOXML" signature [1] before passing the file to POI, but there's not much point: you would be writing a stack of extra code just to replace the try-catch with other logic.

[1] See https://www.loc.gov/preservation/digital/formats/fdd/fdd000395.shtml#sign
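
For anyone who still wants the up-front check, here is a minimal sketch of the signature test described above, using only java.util.zip. It rests on two assumptions: that the workbook part sits at the conventional path xl/workbook.xml, and that the presence of the Strict SpreadsheetML namespace (http://purl.oclc.org/ooxml/spreadsheetml/main) in that part is a good enough signature. The class and method names are hypothetical and are not part of POI.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of POI.
class StrictOoxmlSniffer {

    // Main SpreadsheetML namespace used by Strict OOXML workbooks.
    static final String STRICT_NS = "http://purl.oclc.org/ooxml/spreadsheetml/main";

    // Assumption: the workbook part is stored at the conventional path.
    static final String WORKBOOK_PART = "xl/workbook.xml";

    // True if the given workbook XML mentions the Strict namespace.
    static boolean looksStrict(String workbookXml) {
        return workbookXml.contains(STRICT_NS);
    }

    // Opens the file as a ZIP package and inspects the workbook part.
    // Throws an IOException (ZipException) if the file is not a ZIP at all.
    static boolean isStrictOoxml(File file) throws IOException {
        try (ZipFile zip = new ZipFile(file)) {
            ZipEntry entry = zip.getEntry(WORKBOOK_PART);
            if (entry == null) {
                return false; // no workbook part at the assumed path
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return looksStrict(readUtf8(in));
            }
        }
    }

    // Reads the whole stream; assumes the part is UTF-8 encoded.
    static String readUtf8(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java StrictOoxmlSniffer <file.xlsx>");
            return;
        }
        File file = new File(args[0]);
        System.out.println(file + " strict OOXML: " + isStrictOoxml(file));
    }
}
```

Calling StrictOoxmlSniffer.isStrictOoxml(myFile) before constructing the XSSFWorkbook would let you reject the file with your own message. As noted above, it only moves the failure handling; it does not make POI read the file.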

Stephen C answered Oct 16 '22 03:10
