I'd like to parse an Excel file with Java, so I'm using the Apache POI libraries. Here is the Maven dependency:
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.14</version>
</dependency>
This will include a series of dependencies:
poi-ooxml-3.14.jar
poi-3.14.jar
commons-codec-1.10.jar
poi-ooxml-schemas-3.14.jar
xmlbeans-2.6.0.jar
stax-api-1.0.1.jar
curvesapi-1.03.jar
When I try to read an Office 365 Excel file (.xlsx) with this code:
import java.io.File;
import java.io.FileInputStream;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
public class ExcelConverter {
    public static void main(String[] args) throws Exception {
        String excelFilePath = "C:/temp/Book1.xlsx";
        File myFile = new File(excelFilePath);
        System.out.println("File exists: " + myFile.exists());
        FileInputStream inputStream = new FileInputStream(myFile);
        // The exception below is thrown by this constructor call
        Workbook workbook = new XSSFWorkbook(inputStream);
    }
}
I got the following console message:
File exists: true
Exception in thread "main" org.apache.poi.POIXMLException: Strict OOXML isn't currently supported, please see bug #57699
at org.apache.poi.POIXMLDocumentPart.getPartFromOPCPackage(POIXMLDocumentPart.java:679)
at org.apache.poi.POIXMLDocumentPart.<init>(POIXMLDocumentPart.java:122)
at org.apache.poi.POIXMLDocumentPart.<init>(POIXMLDocumentPart.java:115)
at org.apache.poi.POIXMLDocument.<init>(POIXMLDocument.java:61)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:273)
at org.myCompany.excel.ExcelConverter.main(ExcelConverter.java:25)
Do you know what I can do to solve the issue? Thanks in advance.
There doesn't currently appear to be any simple solution other than this: don't save your spreadsheet in "Strict OOXML" format.
For example, in Excel use
Save As --> "Excel Workbook (.xlsx)"
instead of
Save As --> "Strict Open XML Spreadsheet (.xlsx)"
Do you know why the regular Excel Workbook format and this format have the same file extension?
That would be something that only Microsoft can answer. But I guess that the engineers (or their management) did not anticipate that it would be necessary for application software to make the distinction.
I am accepting files as input and then processing them based on their extension. How can I detect this format without a try-catch?
There is nothing that will let you process the document with the current generation of POI.
I guess you could code something to read the file and look for the signature for "strict OOXML" format [1] before passing the file to POI, but there's not much point. You would be writing a stack of extra code just so that you can replace the try-catch with other logic.
[1] See https://www.loc.gov/preservation/digital/formats/fdd/fdd000395.shtml#sign
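For anyone who still wants the pre-check, here is a minimal sketch of what it might look like using only the JDK. It rests on an assumption: that a Strict OOXML package declares its main document part in `_rels/.rels` with a relationship type under `http://purl.oclc.org/ooxml/`, which, as far as I can tell, is the same marker POI looks for before throwing the exception above. The class name `StrictOoxmlCheck` and the method `looksLikeStrictOoxml` are made up for this example and are not part of POI.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class StrictOoxmlCheck {
    // Assumed marker: relationship type of the main document part in a Strict OOXML package.
    public static final String STRICT_CORE_DOCUMENT =
            "http://purl.oclc.org/ooxml/officeDocument/relationships/officeDocument";

    // Hypothetical helper: returns true if the package-level relationships mention the strict type.
    public static boolean looksLikeStrictOoxml(Path file) throws IOException {
        try (ZipFile zip = new ZipFile(file.toFile())) {
            ZipEntry rels = zip.getEntry("_rels/.rels");
            if (rels == null) {
                return false; // no package relationships, so not an OOXML package
            }
            try (InputStream in = zip.getInputStream(rels)) {
                // Read the whole entry into memory; .rels files are small.
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int read;
                while ((read = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, read);
                }
                String xml = new String(buffer.toByteArray(), StandardCharsets.UTF_8);
                // Crude substring match; a stricter check would parse the XML.
                return xml.contains(STRICT_CORE_DOCUMENT);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("Usage: java StrictOoxmlCheck <file.xlsx>");
            return;
        }
        Path file = Paths.get(args[0]);
        System.out.println("Strict OOXML: " + looksLikeStrictOoxml(file));
    }
}
```

You would call `looksLikeStrictOoxml` before `new XSSFWorkbook(...)` and reject the file (or ask the user to re-save it) when it returns true. Note that this still throws an `IOException` for files that are not valid ZIP archives, so it does not remove the need for error handling altogether.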