
FileHelpers ExcelStorage.ExtractRecords fails when first cell is empty

When the first cell of an Excel sheet imported with ExcelStorage.ExtractRecords is empty, the process fails. For example, if the data starts at column 1, row 2, and cell (2,1) is empty, the method fails.

Does anybody know how to work around this? I've tried adding a FieldNullValue attribute to the mapping class, with no luck.

Here is a sample project that shows the code with the problem.

I hope somebody can help me or point me in the right direction.

Thank you!

Sebastian asked Sep 22 '09 23:09


3 Answers

It looks like you have stumbled upon an issue in FileHelpers.

What is happening is that the ExcelStorage.ExtractRecords method uses an empty cell check to see if it has reached the end of the sheet. This can be seen in the ExcelStorage.cs source code:

while (CellAsString(cRow, mStartColumn) != String.Empty)
{
    try
    {
        recordNumber++;
        Notify(mNotifyHandler, mProgressMode, recordNumber, -1);

        colValues = RowValues(cRow, mStartColumn, RecordFieldCount);

        object record = ValuesToRecord(colValues);
        res.Add(record);

    }
    catch (Exception ex)
    {
        // Code removed for this example
    }
}


So if the start column of any row is empty, the loop assumes it has reached the end of the data and stops. When that row is the first data row, no records are read at all.
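To make the effect concrete, here is a minimal sketch that replays the same loop condition against an in-memory grid instead of a real worksheet. The `Cell` helper is a hypothetical stand-in for `CellAsString`, and the zero-based indices are an assumption made for brevity; none of this is FileHelpers code.

```csharp
using System;

static class FirstCellCheckDemo
{
    // Hypothetical stand-in for CellAsString: cells outside the grid read as empty.
    static string Cell(string[][] sheet, int row, int col)
    {
        if (row < 0 || row >= sheet.Length) return string.Empty;
        if (col < 0 || col >= sheet[row].Length) return string.Empty;
        return sheet[row][col] ?? string.Empty;
    }

    // Same end condition as the quoted loop: stop at the first empty start cell.
    public static int CountRowsRead(string[][] sheet, int startRow, int startCol)
    {
        int row = startRow;
        int count = 0;
        while (Cell(sheet, row, startCol) != string.Empty)
        {
            count++;
            row++;
        }
        return count;
    }

    static void Main()
    {
        var sheet = new[]
        {
            new[] { "", "Alice", "10" }, // first cell of the first data row is empty
            new[] { "2", "Bob", "20" },
        };

        // The loop never runs, so both rows are lost.
        Console.WriteLine(CountRowsRead(sheet, 0, 0)); // prints 0
    }
}
```

Starting the same scan at the second row reads one record, which shows that only the start cell decides where the data ends.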

Some options to get around this:

  1. Don't put any empty cells in the first column position.
  2. Don't use Excel as your file format; convert to CSV first.
  3. See if you can get a patch from the developer or patch the source yourself.

The first two are workarounds (and not really good ones). The third option might be the best, but then what is the end-of-file condition? An entirely empty row would probably be a good enough check, though even that might not work in every case.

Randy supports Monica answered Oct 20 '22 05:10


Thanks to Tuzo's help, I figured out a way to work around this. I added a method to the ExcelStorage class to change the loop's end condition. Instead of checking only the first cell for an empty value, I check whether all cells in the current row are empty. If they are, the method returns true and the while loop ends. This is the change to the while part of ExtractRecords:

while (!IsEof(cRow, mStartColumn, RecordFieldCount))

instead of

while (CellAsString(cRow, mStartColumn) != String.Empty)

IsEof is a method that checks whether the whole row is empty:

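    // Returns true when every cell in the record's column range of this row is empty.
    private bool IsEof(int row, int startCol, int numberOfCols)
    {
        // numberOfCols is a count, so the last column is startCol + numberOfCols - 1.
        // (The bound "i <= numberOfCols" only covers the full range when startCol is 1.)
        for (int i = startCol; i < startCol + numberOfCols; i++)
        {
            if (CellAsString(row, i) != string.Empty)
            {
                return false;
            }
        }

        return true;
    }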

Of course, if the user leaves an empty row between two data rows, the rows after it will not be processed, but I think this is a good starting point to keep working from.
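The trade-off can be sketched against an in-memory grid. This is an illustration only: `Cell` is a hypothetical stand-in for `CellAsString` using 1-based indices, and the sketch assumes `numberOfCols` is a column count starting at `startCol`, which is how `RowValues` appears to use it.

```csharp
using System;

static class WholeRowCheckDemo
{
    // Hypothetical stand-in for CellAsString, 1-based like Excel; outside the grid reads as empty.
    static string Cell(string[][] sheet, int row, int col)
    {
        if (row < 1 || row > sheet.Length) return string.Empty;
        string[] cells = sheet[row - 1];
        if (col < 1 || col > cells.Length) return string.Empty;
        return cells[col - 1] ?? string.Empty;
    }

    // True when all numberOfCols cells starting at startCol are empty.
    static bool IsEof(string[][] sheet, int row, int startCol, int numberOfCols)
    {
        for (int i = startCol; i < startCol + numberOfCols; i++)
        {
            if (Cell(sheet, row, i) != string.Empty) return false;
        }
        return true;
    }

    // Counts the rows the modified while loop would process.
    public static int CountRowsRead(string[][] sheet, int startRow, int startCol, int numberOfCols)
    {
        int row = startRow;
        int count = 0;
        while (!IsEof(sheet, row, startCol, numberOfCols))
        {
            count++;
            row++;
        }
        return count;
    }

    static void Main()
    {
        var sheet = new[]
        {
            new[] { "", "Alice", "10" }, // empty first cell no longer ends the scan
            new[] { "2", "Bob", "20" },
            new[] { "", "", "" },        // a fully blank row still does
            new[] { "4", "Dan", "40" },  // never reached
        };

        Console.WriteLine(CountRowsRead(sheet, 1, 1, 3)); // prints 2
    }
}
```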

Thanks

Sebastian answered Oct 20 '22 05:10


I needed to be able to skip blank lines, so I've added the following code to the FileHelpers library. I've taken Sebastian's IsEof code and renamed the method to IsRowEmpty and changed the loop in ExtractRecords from ...

while (CellAsString(cRow, mStartColumn) != String.Empty)

to ...

while (!IsRowEmpty(cRow, mStartColumn, RecordFieldCount) || !IsRowEmpty(cRow+1, mStartColumn, RecordFieldCount))

I then changed this ...

colValues = RowValues(cRow, mStartColumn, RecordFieldCount);

object record = ValuesToRecord(colValues);
res.Add(record);

to this ...

bool addRow = true;

if (Attribute.GetCustomAttribute(RecordType, typeof(IgnoreEmptyLinesAttribute)) != null && IsRowEmpty(cRow, mStartColumn, RecordFieldCount))
{
    addRow = false;
}

if (addRow)
{
    colValues = RowValues(cRow, mStartColumn, RecordFieldCount);

    object record = ValuesToRecord(colValues);
    res.Add(record);
}

This lets me skip single empty rows on record types marked with IgnoreEmptyLines. The sheet is read until two successive empty rows are found.
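The combined behaviour can be checked with a small simulation. This is a sketch under stated assumptions rather than the library code: the sheet is a plain `string[][]`, rows are zero-based, rows past the end count as empty, and the record type is assumed to always carry `IgnoreEmptyLines`, so blank rows are never added.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SkipBlankRowsDemo
{
    // Rows past the end of the grid count as empty, like unused worksheet rows.
    static bool IsRowEmpty(string[][] sheet, int row)
    {
        if (row >= sheet.Length) return true;
        return sheet[row].All(string.IsNullOrEmpty);
    }

    // Keeps reading until the current row and the next one are both empty.
    public static List<string[]> ReadRows(string[][] sheet)
    {
        var result = new List<string[]>();
        int row = 0;
        while (!IsRowEmpty(sheet, row) || !IsRowEmpty(sheet, row + 1))
        {
            // Assumes IgnoreEmptyLines is set: single blank rows are skipped, not added.
            if (!IsRowEmpty(sheet, row))
            {
                result.Add(sheet[row]);
            }
            row++;
        }
        return result;
    }

    static void Main()
    {
        var sheet = new[]
        {
            new[] { "1", "a" },
            new[] { "", "" },   // single blank row: skipped
            new[] { "", "b" },  // empty first cell: still read
            new[] { "", "" },
            new[] { "", "" },   // second blank row in a row: reading stops
            new[] { "3", "c" }, // never reached
        };

        Console.WriteLine(ReadRows(sheet).Count); // prints 2
    }
}
```

Data that follows two or more consecutive blank rows is still dropped, so the gap tolerance is a design choice and not a full end-of-data detection.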

Antony Scott answered Oct 20 '22 03:10