
How do I skip white-space only lines using Super CSV?

Tags: java, csv, supercsv

How do I configure Super CSV to skip blank or white-space only lines?

I'm using the CsvListReader and sometimes I'll get a blank line in my data. When this happens, I get an exception to the effect of:

number of CellProcessors must match number of fields

I'd like to simply skip these lines.

dkantowitz asked Feb 18 '26 13:02


1 Answer

Update: Super CSV 2.1.0 (released April 2013) lets you supply a CommentMatcher via the preferences (skipComments(...) on CsvPreference.Builder) so that lines considered comments are skipped. There are two built-in matchers you can use, or you can supply your own. In this case you could use new CommentMatches("\\s+") to skip white-space only lines.
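To see which lines the "\\s+" pattern would catch, here is a minimal sketch using only java.util.regex. It assumes the matcher applies the regex to the whole line (a full match); the class and method names are made up for illustration and are not part of Super CSV.

```java
import java.util.regex.Pattern;

public class BlankLinePatternDemo {

    // Same regex as suggested above; a full-line match is assumed.
    private static final Pattern BLANK = Pattern.compile("\\s+");

    /** Returns true if the line consists entirely of one or more white-space characters. */
    public static boolean isWhitespaceOnly(String line) {
        return BLANK.matcher(line).matches();
    }

    public static void main(String[] args) {
        // Hypothetical sample lines: spaces, a tab, empty, real data, comma with spaces.
        String[] samples = { "   ", "\t", "", "a,b", " , " };
        for (String s : samples) {
            System.out.println("[" + s + "] -> " + isWhitespaceOnly(s));
        }
    }
}
```

Note that the empty string does not match "\\s+" (the pattern needs at least one character), which is fine because zero-length lines are already skipped. A line such as " , " does not match either, since it contains a comma.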


Super CSV only skips lines of zero length (just a line terminator).

Strictly speaking, a CSV file with blank lines isn't valid (see rule 4 of RFC 4180, which states that "Each line should contain the same number of fields throughout the file"). The only time a blank line is valid is when it's part of a multi-line field surrounded by quotes, e.g.

column1,column2
"multi-line field

with a blank line",value2

That being said, it might be possible to make Super CSV a bit more lenient with blank lines (it could ignore them). If you could post a feature request on our SourceForge page, we can investigate this further and potentially add this functionality in a future release.

That doesn't help you right now though!

I haven't done extensive testing on this, but it should work :) You can write your own tokenizer that skips blank lines:

package org.supercsv.io;

import java.io.IOException;
import java.io.Reader;
import java.util.List;

import org.supercsv.prefs.CsvPreference;

public class SkipBlankLinesTokenizer extends Tokenizer {

    public SkipBlankLinesTokenizer(Reader reader, CsvPreference preferences) {
        super(reader, preferences);
    }

    @Override
    public boolean readColumns(List<String> columns) throws IOException {

        boolean moreInput = super.readColumns(columns);

        // keep reading while the row is blank: either no columns at all,
        // or a single column containing only white space
        while (moreInput && (columns.size() == 0 ||
                             (columns.size() == 1 &&
                              columns.get(0).trim().isEmpty()))) {
            moreInput = super.readColumns(columns);
        }

        return moreInput;
    }

}

And just pass this into the constructor of your reader (you'll have to pass the preferences into both the reader and the tokenizer):

ICsvListReader listReader = null;
try {
    CsvPreference prefs = CsvPreference.STANDARD_PREFERENCE;
    listReader = new CsvListReader(
        new SkipBlankLinesTokenizer(new FileReader(CSV_FILENAME), prefs),
        prefs);
...
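As a rough illustration of what the tokenizer's loop condition treats as "blank", the check can be pulled out into a standalone helper and tried on a few column lists. This is only a sketch: BlankRowCheck and isBlankRow are hypothetical names, and the sample lists are assumptions about how lines would be split into columns, not output from Super CSV.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BlankRowCheck {

    /**
     * Mirrors the loop condition in SkipBlankLinesTokenizer:
     * no columns, or exactly one column that trims to an empty string.
     */
    public static boolean isBlankRow(List<String> columns) {
        return columns.isEmpty()
                || (columns.size() == 1 && columns.get(0).trim().isEmpty());
    }

    public static void main(String[] args) {
        // Hypothetical rows: no columns, one white-space column,
        // two white-space columns (e.g. from " , "), and real data.
        System.out.println(isBlankRow(Collections.<String>emptyList()));
        System.out.println(isBlankRow(Arrays.asList("   ")));
        System.out.println(isBlankRow(Arrays.asList(" ", " ")));
        System.out.println(isBlankRow(Arrays.asList("a", "b")));
    }
}
```

One consequence of this condition: a row with two or more columns is never treated as blank, even if every column is white space, so a line made up only of commas and spaces would still reach the cell processors.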

Hope this helps

James Bassett answered Feb 20 '26 02:02


