
opencsv in java ignores backslash in a field value

Tags: java, opencsv

I am reading a csv file using opencsv.

I skip the first line of the file. The file is tab-separated, with some values enclosed in double quotes.

The problem occurs when I read a column value that contains the '\' character: the backslash is stripped out of the value.

reader = new CSVReader(new FileReader(exchFileObj),'\t','"',1);

For example, in the original file:

address = 12\91buenosaires

In the string array that CSVReader generates, it becomes:

address = 1291buenosaires

How do I configure the reader so that the '\' character is preserved?

ahaneo asked May 15 '11 12:05


4 Answers

I had the same problem and couldn't find another character that I could guarantee wouldn't show up in my csv file. According to a post on SourceForge, though, you can use the explicit constructor with '\0' as the escape character to indicate that you don't want any escape character at all.

http://sourceforge.net/tracker/?func=detail&aid=2983890&group_id=148905&atid=773542

CSVParser parser = new CSVParser(CSVParser.DEFAULT_SEPARATOR, CSVParser.DEFAULT_QUOTE_CHARACTER, '\0', CSVParser.DEFAULT_STRICT_QUOTES);

I did a bit of cursory testing, and this seems to work just fine; at the very least, backslashes make it through.
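To illustrate why the backslash disappears in the first place, here is a minimal, self-contained sketch of a field splitter with a configurable escape character. It is not opencsv's actual code; the class EscapeDemo, the method split and the constant NO_ESCAPE are hypothetical names made up for this illustration, and the sketch ignores details such as doubled quotes and multi-line fields. It only assumes that the parser treats '\' as its default escape character and consumes it unless it escapes a quote or another escape character.

```java
import java.util.ArrayList;
import java.util.List;

public class EscapeDemo {
    /** Sentinel meaning "no escape character", mirroring the '\0' trick above. */
    public static final char NO_ESCAPE = '\0';

    /** Splits one line into fields (simplified; hypothetical helper, not opencsv code). */
    public static List<String> split(String line, char separator, char quote, char escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (escape != NO_ESCAPE && c == escape) {
                // The escape character itself is always consumed. Only an escaped
                // quote or escape character inside quotes is kept as a literal.
                boolean hasNext = i + 1 < line.length();
                if (inQuotes && hasNext
                        && (line.charAt(i + 1) == quote || line.charAt(i + 1) == escape)) {
                    current.append(line.charAt(i + 1));
                    i++;
                }
            } else if (c == quote) {
                inQuotes = !inQuotes;
            } else if (c == separator && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String line = "1\t\"12\\91buenosaires\"";
        // Backslash as escape character: prints [1, 1291buenosaires]
        System.out.println(split(line, '\t', '"', '\\'));
        // No escape character: prints [1, 12\91buenosaires]
        System.out.println(split(line, '\t', '"', NO_ESCAPE));
    }
}
```

With the escape character disabled, the backslash falls through to the ordinary "append this character" branch, which is the effect the '\0' argument is meant to achieve.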

JMM answered Nov 14 '22 13:11


opencsv also has a parser builder (CSVParserBuilder) through which you can set the escape character. If you use it and set the escape character to something that never occurs in your data, the backslash characters in your input will be preserved.
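A minimal sketch of the builder approach for the tab-separated file from the question might look like the following. It assumes a reasonably recent opencsv (the com.opencsv package with CSVParserBuilder and CSVReaderBuilder) on the classpath; the class TabFileReader and the method readRows are hypothetical names, and '\0' is used as the "no escape character" value mentioned in the other answer. The method declares throws Exception because the checked exceptions thrown by readAll differ between opencsv versions.

```java
import com.opencsv.CSVParser;
import com.opencsv.CSVParserBuilder;
import com.opencsv.CSVReader;
import com.opencsv.CSVReaderBuilder;

import java.io.Reader;
import java.io.StringReader;
import java.util.List;

public class TabFileReader {
    /** Reads tab-separated rows, skipping the header line and keeping backslashes. */
    public static List<String[]> readRows(Reader source) throws Exception {
        // Tab separator, double-quote quoting, and no escape character.
        CSVParser parser = new CSVParserBuilder()
                .withSeparator('\t')
                .withQuoteChar('"')
                .withEscapeChar('\0')
                .build();
        // Skip the first line, as the original constructor call did with line = 1.
        try (CSVReader reader = new CSVReaderBuilder(source)
                .withSkipLines(1)
                .withCSVParser(parser)
                .build()) {
            return reader.readAll();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the real file; in practice pass new FileReader(exchFileObj).
        String data = "id\taddress\n1\t\"12\\91buenosaires\"\n";
        for (String[] row : readRows(new StringReader(data))) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```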

rsp answered Nov 14 '22 13:11


In addition to @JMM's answer, you have to pass the CSVParser you created to the constructor of the CSVReader. The only available constructor is:

public CSVReader(Reader reader, int line, CSVParser csvParser)

You can set line to 0 so that it does not skip anything, or to 1 to skip the first line as in the question.

Wael answered Nov 14 '22 11:11


Note: I think the solution in this answer is better than the three alternatives because it configures a compliant reader in a coarse-grained manner, by relying on the RFC. The other answers go into the details of configuring an escape character. While that works, it feels more like a white-box solution.

By default, OpenCSV's reader does not match its writer: the reader is not RFC 4180 compliant. Don't ask me why that is; I find it as troubling and perplexing as you do.

The solution is to configure your CSVReader with an RFC-compliant parser:

RFC4180Parser rfc4180Parser = new RFC4180ParserBuilder().build();
CSVReaderBuilder csvReaderBuilder =
  new CSVReaderBuilder(new StringReader(writer.toString()))
      .withCSVParser(rfc4180Parser);
reader = csvReaderBuilder.build();

Here is the source page for the above.

Mihai Danila answered Nov 14 '22 13:11