Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Invalid header reading xls file

I am reading one excel file on my local system. I am using POI jar Version 3.7, but getting error Invalid header signature; read -2300849302551019537 or in Hex 0xE011BDBFEFBDBFEF , expected -2226271756974174256 or in Hex 0xE11AB1A1E011CFD0.

Opening the xls file with Excel works fine.

The codeblock where it happens: Anybody an idea ?

/**
 * create a new HeaderBlockReader from an InputStream
 *
 * @param stream the source InputStream
 *
 * @exception IOException on errors or bad data
 */
public HeaderBlockReader(InputStream stream) throws IOException {
    // At this point, we don't know how big our
    //  block sizes are
    // So, read the first 32 bytes to check, then
    //  read the rest of the block
    byte[] blockStart = new byte[32];
    int bsCount = IOUtils.readFully(stream, blockStart);
    if(bsCount != 32) {
        throw alertShortRead(bsCount, 32);
    }

    // verify signature
    long signature = LittleEndian.getLong(blockStart, _signature_offset);

    if (signature != _signature) {
        // Is it one of the usual suspects?
        byte[] OOXML_FILE_HEADER = POIFSConstants.OOXML_FILE_HEADER;
        if(blockStart[0] == OOXML_FILE_HEADER[0] &&
            blockStart[1] == OOXML_FILE_HEADER[1] &&
            blockStart[2] == OOXML_FILE_HEADER[2] &&
            blockStart[3] == OOXML_FILE_HEADER[3]) {
            throw new OfficeXmlFileException("The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)");
        }
        if ((signature & 0xFF8FFFFFFFFFFFFFL) == 0x0010000200040009L) {
            // BIFF2 raw stream starts with BOF (sid=0x0009, size=0x0004, data=0x00t0)
            throw new IllegalArgumentException("The supplied data appears to be in BIFF2 format.  "
                    + "POI only supports BIFF8 format");
        }

        // Give a generic error
        throw new IOException("Invalid header signature; read "
                              + longToHex(signature) + ", expected "
                              + longToHex(_signature));
    }
like image 949
dutchman79 Avatar asked Dec 19 '12 10:12

dutchman79


2 Answers

Just an idee, if you using maven make sure in the resource tag filtering is set to false. Otherwise maven tends to corrupt xls files in the copying phase

like image 111
Felix Avatar answered Oct 10 '22 05:10

Felix


That exception is telling you that your file isn't a valid OLE2-based .xls file.

Being able to open the file in Excel is no real guide - Excel will happily open any file it knows about no matter what the extension is on it. If you take a .csv file and rename it to .xls, Excel will still open it, but the renaming hasn't magically made it be in the .xls format so POI won't open it for you.

If you open the file in Excel and do Save-As, it'll let you write it out as a real Excel file. If you want to know what file it really is, try using Apache Tika - the Tika CLI with --detect ought to be able to tell you

.

How can I be sure it's not a valid file? If you look at the OLE2 file format specification doc from Microsoft, and head to section 2.2 you'll see the following:

Header Signature (8 bytes): Identification signature for the compound file structure, and MUST be set to the value 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1.

Flip those bytes round (OLE2 is little endian) and you get 0xE11AB1A1E011CFD0, the magic number from the exception. Your file doesn't start with that magic number, as so really isn't a valid OLE2 document, and hence POI gives you that exception.

like image 27
Gagravarr Avatar answered Oct 10 '22 05:10

Gagravarr