Apache CSV parser with duplicate column headers

Question

I need to process CSV files that have duplicate headers. Each data item spans three columns (min, average and max, in that order), but all three columns share the same header.

The Apache CSV parser throws:

java.lang.IllegalArgumentException: The header contains a duplicate name:

How can I configure the parser to accept duplicate headers?

Matthias Wiehl · Accepted Answer

There is no pre-defined configuration parameter in CSVParser that controls whether duplicate column names are acceptable.

A look at the source code shows that the initializeHeader method creates a Map which will have column names as keys and column indices as values. If you want to use header mappings, the column names must be unique.
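To see why a name-to-index Map forces unique column names, here is a minimal sketch in plain Java. It is a simplified stand-in, not the library's actual initializeHeader code; the class and method names (HeaderMapSketch, buildHeaderMap) are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class HeaderMapSketch {

    // Hypothetical helper: maps each column name to its index, roughly the
    // idea described above. Not the real Commons CSV implementation.
    static Map<String, Integer> buildHeaderMap(String[] header) {
        Map<String, Integer> indexByName = new LinkedHashMap<>();
        for (int i = 0; i < header.length; i++) {
            // A Map holds one value per key, so a repeated name could only
            // point at one of its columns; reject it instead.
            if (indexByName.containsKey(header[i])) {
                throw new IllegalArgumentException(
                        "The header contains a duplicate name: \"" + header[i] + "\"");
            }
            indexByName.put(header[i], i);
        }
        return indexByName;
    }

    public static void main(String[] args) {
        // Unique names map cleanly to indices 0, 1, 2.
        System.out.println(buildHeaderMap(new String[] {"min", "avg", "max"}));
        try {
            // The same name three times cannot be represented in the map.
            buildHeaderMap(new String[] {"temp", "temp", "temp"});
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a header such as temp,temp,temp there is no way for a lookup by name to tell the three columns apart, which is why lookup by name and duplicate names do not mix.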

However, there is a solution:

Specify a CSVFormat that ignores the column names defined on the first row of the CSV file, and define your column names manually.

From the CSVFormat documentation:

Defining column names

To define the column names you want to use to access records, write:
CSVFormat.EXCEL.withHeader("Col1", "Col2", "Col3");
Calling withHeader(String...) lets you use the given names to address values in a CSVRecord, and assumes that your CSV source does not contain a first record that also defines column names. If it does, then you are overriding this metadata with your names and you should skip the first record by calling withSkipHeaderRecord(boolean) with true.
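Applied to the question, the approach could look like the sketch below. It assumes Commons CSV is on the classpath; the column names (temp_min, temp_avg, temp_max), the class name and the sample data are invented for illustration, so substitute whatever fits your files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

final class DuplicateHeaderWorkaround {

    // Hypothetical unique names that stand in for the duplicated file header.
    static final String[] COLUMNS = {"temp_min", "temp_avg", "temp_max"};

    // Reads the middle (average) column of every data row.
    static List<String> readAverages(String csv) {
        CSVFormat format = CSVFormat.DEFAULT
                .withHeader(COLUMNS)          // use our names, not the file's
                .withSkipHeaderRecord(true);  // skip the file's own header row
        List<String> averages = new ArrayList<>();
        try (CSVParser parser = CSVParser.parse(csv, format)) {
            for (CSVRecord record : parser) {
                averages.add(record.get("temp_avg"));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return averages;
    }

    public static void main(String[] args) {
        // Sample input whose header row repeats the same name three times.
        String csv = "temp,temp,temp\n1,2,3\n4,5,6\n";
        System.out.println(readAverages(csv));
    }
}
```

Because the names come from the format rather than from the file, the duplicated header row is never turned into a header map; it is simply skipped.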

Miha Hribar · Answer

Newer versions of Commons CSV let you configure the format to allow duplicate headers:

CSVFormat csvFormat = CSVFormat.DEFAULT.withAllowDuplicateHeaderNames();

Tags:

csv

apache-commons

klonq
