
JXL and Apache POI: parsing Excel files with attached image objects

I have tried using both JXL and Apache POI to load data from an Excel file. Until now JXL has worked fine, even when an image is embedded in the file.

I received a file from a source that won't parse. With POI I get the following exception:

Exception in thread "main" org.apache.poi.hssf.record.RecordInputStream$LeftoverDataException: Initialisation of record 0xE2 left 2 bytes remaining still to be read.
     at org.apache.poi.hssf.record.RecordInputStream.hasNextRecord(RecordInputStream.java:124)
     at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:402)
     at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:277)
     at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:202)
     at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:184)
     at testXlsParsers.main(TestXlsParsers.java:19)

With JXL I get an IndexOutOfBoundsException:

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
     at java.util.ArrayList.RangeCheck(ArrayList.java:546)
     at java.util.ArrayList.get(ArrayList.java:321)
     at jxl.read.biff.WorkbookParser.getSheet(WorkbookParser.java:247)
     at ParserXLS.parse(ParserXLS.java:27)
     at ParserXLS.main(ParserXLS.java:46)

The file opens in Excel but not in OpenOffice. The only unusual thing I can see in the raw data is an object related to Adobe XMP Core 4.1, which seems to be the cause of the problem: if I remove the image, the file parses fine, and if I insert a different JPG instead, it also parses fine.

<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 4.1-c036 46.277092, Fri Feb 23 2007 14:16:18        ">

Is there some way to ignore this? How would I go about parsing this file?

Thanks.

Dean asked Nov 05 '22 19:11


1 Answer

One thing to try is using a newer version of Apache POI - bugs like this get fixed over time.

If the latest version of POI (3.8 beta 2 at the time of writing) doesn't help, open a new bug in the POI Bugzilla and upload the problem file. The exception means that POI expected record 0xE2 to contain a certain amount of data and found 2 bytes more than that. With the file in hand, it will be possible to identify why the extra data is there and work around it (assuming it hasn't already been fixed).
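
To make the "more data than expected" idea concrete, here is a minimal sketch of that kind of check, using only the JDK. It is not POI's code. It assumes the usual BIFF record layout (a 2-byte little-endian record id, a 2-byte little-endian length, then the payload) and, purely for illustration, that record 0xE2 is expected to carry no payload. The class name `LeftoverCheckSketch` and both method names are made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class LeftoverCheckSketch {

    /**
     * Payload size this toy parser knows how to consume for a record id.
     * Assumption for illustration only: record 0xE2 has an empty payload.
     * Returns -1 for ids the sketch does not model.
     */
    static int expectedPayload(int sid) {
        return sid == 0xE2 ? 0 : -1;
    }

    /**
     * Reads the header of the first record in the stream and returns how many
     * declared payload bytes the toy parser would leave unread.
     */
    static int leftover(byte[] stream) {
        ByteBuffer buf = ByteBuffer.wrap(stream).order(ByteOrder.LITTLE_ENDIAN);
        int sid = buf.getShort() & 0xFFFF;       // record id
        int declared = buf.getShort() & 0xFFFF;  // payload length from the header
        int expected = expectedPayload(sid);
        // Unknown records are accepted as-is in this sketch.
        return expected < 0 ? 0 : declared - expected;
    }

    public static void main(String[] args) {
        // Header only: id 0xE2, declared length 0.
        byte[] wellFormed = {(byte) 0xE2, 0x00, 0x00, 0x00};
        // Same id, but the header declares 2 payload bytes.
        byte[] oversized = {(byte) 0xE2, 0x00, 0x02, 0x00, 0x00, 0x00};

        System.out.println(leftover(wellFormed)); // prints 0
        System.out.println(leftover(oversized));  // prints 2
    }
}
```

A nonzero result here corresponds to the "left 2 bytes remaining still to be read" situation in the stack trace: the record header declares more data than the parser consumes. Why the writer of the file put those bytes there is exactly what the uploaded file would help determine.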

Gagravarr answered Nov 12 '22 15:11