
How to parse a CSV file that might have one of two delimiters?

In my case, a valid CSV file is one delimited by either a comma or a semicolon. I am open to other libraries, but the solution needs to be in Java. Reading through the Apache CSVParser API, the only thing I can think of is the following, which seems inefficient and ugly.

// declared outside the try blocks so the fallback branch can see them
BufferedReader reader = new BufferedReader(new InputStreamReader(file));
CSVFormat csvFormat = CSVFormat.EXCEL.withHeader().withDelimiter(';');
CSVParser parser;
try
{
   parser = csvFormat.parse( reader );
   // now read the records
}
catch (IOException eee)
{
   try
   {
      // try the other valid delimiter
      // (the reader may already be partly consumed at this point)
      csvFormat = CSVFormat.EXCEL.withHeader().withDelimiter(',');
      parser = csvFormat.parse( reader );
      // now read the records
   }
   catch (IOException eee2)
   {
      // then it's really not a valid CSV file
   }
}

Is there a way to check the delimiter first, or perhaps to allow two delimiters? Does anyone have a better idea than just catching an exception?

Coder1224 asked Aug 12 '15 00:08




1 Answer

We built support for this in uniVocity-parsers:

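import com.univocity.parsers.csv.CsvParser;
import com.univocity.parsers.csv.CsvParserSettings;
import java.util.List;

public static void main(String... args) {
    CsvParserSettings settings = new CsvParserSettings();
    // let the parser inspect the input and work out the delimiter itself
    settings.setDelimiterDetectionEnabled(true);

    CsvParser parser = new CsvParser(settings);

    // `file` is the CSV input to parse (for example a java.io.File)
    List<String[]> rows = parser.parseAll(file);
}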

The parser has many more features that I'm sure you will find useful. Give it a try.

Disclaimer: I'm the author of this library. It's open source and free (Apache 2.0 license).
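
If adding a dependency is not an option, the "check the delimiter first" idea from the question can be roughly sketched with the standard library alone. This is only a minimal illustration, not how any particular library does its detection. The class name DelimiterSniffer and its methods are hypothetical. It assumes the first line is a header that contains at least one delimiter, that the header is shorter than 8192 characters, and that a tie between commas and semicolons should fall back to a comma.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical helper: guesses whether a CSV header line uses ',' or ';'.
public class DelimiterSniffer {

    /** Counts occurrences of candidate that are not inside double quotes. */
    public static int countOutsideQuotes(String line, char candidate) {
        int count = 0;
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                // an escaped quote ("") toggles twice, so the state is preserved
                inQuotes = !inQuotes;
            } else if (c == candidate && !inQuotes) {
                count++;
            }
        }
        return count;
    }

    /** Returns ';' if semicolons outnumber commas, otherwise ','. */
    public static char detect(String headerLine) {
        int commas = countOutsideQuotes(headerLine, ',');
        int semicolons = countOutsideQuotes(headerLine, ';');
        if (commas == 0 && semicolons == 0) {
            // assumption: a file with neither delimiter is treated as invalid
            throw new IllegalArgumentException("no comma or semicolon in header line");
        }
        return semicolons > commas ? ';' : ',';
    }

    /** Peeks at the first line, then rewinds so a parser can start from the top. */
    public static char detect(BufferedReader reader) throws IOException {
        reader.mark(8192); // assumption: the header line fits within 8192 chars
        String header = reader.readLine();
        reader.reset();
        if (header == null) {
            throw new IOException("empty input");
        }
        return detect(header);
    }

    public static void main(String[] args) throws IOException {
        String sample = "id;name;note\n1;Ann;\"a,b\"\n";
        BufferedReader reader = new BufferedReader(new StringReader(sample));
        System.out.println(detect(reader)); // prints ;
    }
}
```

The detected character could then be passed to CSVFormat.EXCEL.withHeader().withDelimiter(...) from the question, using the same reader, since it has been rewound to the start of the input.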

Jeronimo Backes answered Oct 05 '22 01:10
