I have a CSV file with some data in this format:
id,first,last,city
1,john,doe,austin
2,jane,mary,seattle
Right now I'm reading in the CSV using this code:
String path = "./data/data.csv";
Map<Integer, User> map = new HashMap<>();
// Open the reader inside try-with-resources so it is closed and its IOException is caught
try (Reader reader = Files.newBufferedReader(Paths.get(path));
     CSVParser csvParser = new CSVParser(reader, CSVFormat.DEFAULT)) {
    List<CSVRecord> csvRecords = csvParser.getRecords();
    for (int i = 0; i < csvRecords.size(); i++) {
        if (0 < i) { // skip over header
            CSVRecord csvRecord = csvRecords.get(i);
            // Columns are read by position, so this depends on the file's column order
            User currentUser = new User(
                Integer.parseInt(csvRecord.get(0)), // id
                csvRecord.get(1),                   // first
                csvRecord.get(2),                   // last
                csvRecord.get(3)                    // city
            );
            map.put(currentUser.getId(), currentUser);
        }
    }
} catch (IOException e) {
    System.out.println(e);
}
This grabs the correct values, but if the columns were in a different order, say [city,last,id,first], they would be read incorrectly, because the reading is hard-coded to the order [id,first,last,city]. (The User object must also be created with the fields in the exact order id,first,last,city.)
I know that I can use the 'withHeader' option, but that also requires me to define the header column order in advance like so:
String header = "id,first,last,city";
CSVParser csvParser = new CSVParser(reader, CSVFormat.EXCEL.withHeader(header.split(",")));
I also know there is a built-in function getHeaderNames(), but that only returns the headers after I've already passed them in as a string (so hard-coding again). So if I passed in the header string "last,first,id,city", it would return exactly that in a list.
Is there a way to combine these bits to read in the csv no matter what the column orders are and to define my 'User' object with fields passed in order (id,first,last,city)?
We need to tell the parser to process the header line for us. We specify that as part of the CSVFormat, so we'll create a custom format like this:
CSVFormat csvFormat = CSVFormat.RFC4180.withFirstRecordAsHeader();
The question's code used DEFAULT, but this format is based on RFC4180 instead. Comparing them side by side:
DEFAULT                              RFC4180                      Comment
===================================  ===========================  ========================
withDelimiter(',')                   withDelimiter(',')           Same
withQuote('"')                       withQuote('"')               Same
withRecordSeparator("\r\n")          withRecordSeparator("\r\n")  Same
withIgnoreEmptyLines(true)           withIgnoreEmptyLines(false)  Don't ignore blank lines
withAllowDuplicateHeaderNames(true)  -                            Don't allow duplicates
===================================  ===========================  ========================
                                     withFirstRecordAsHeader()    We need this
With that change, we can call get(String name) instead of get(int i):
User currentUser = new User(
    Integer.parseInt(csvRecord.get("id")),
    csvRecord.get("first"),
    csvRecord.get("last"),
    csvRecord.get("city")
);
Note that CSVParser implements Iterable<CSVRecord>, so we can use a for-each loop, which makes the code look like this:
String path = "./data/data.csv";
Map<Integer, User> map = new HashMap<>();
try (CSVParser csvParser = new CSVParser(Files.newBufferedReader(Paths.get(path)),
                                         CSVFormat.RFC4180.withFirstRecordAsHeader())) {
    for (CSVRecord csvRecord : csvParser) {
        User currentUser = new User(
            Integer.parseInt(csvRecord.get("id")),
            csvRecord.get("first"),
            csvRecord.get("last"),
            csvRecord.get("city")
        );
        map.put(currentUser.getId(), currentUser);
    }
}
That code correctly parses the file, even if the column order changes, e.g. to:
last,first,id,city
doe,john,1,austin
mary,jane,2,seattle
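To see why name-based lookup is independent of column order, here is a rough, standard-library-only sketch of the idea: read the header line, build a name-to-index map, then fetch each field by name. This is only an illustration and not how Commons CSV is implemented; the class HeaderLookupSketch and its methods are hypothetical names, and the naive split(",") assumes there are no quoted or escaped fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Commons CSV.
class HeaderLookupSketch {

    // Build a column-name -> column-index map from the header line.
    static Map<String, Integer> headerIndex(String headerLine) {
        Map<String, Integer> index = new HashMap<>();
        String[] names = headerLine.split(",");
        for (int i = 0; i < names.length; i++) {
            index.put(names[i].trim(), i);
        }
        return index;
    }

    // Return every data row with its fields reordered as id, first, last, city,
    // whatever order the columns have in the input.
    // Assumption: plain comma-separated values without quoting.
    static List<String[]> readInFixedOrder(List<String> lines) {
        Map<String, Integer> index = headerIndex(lines.get(0));
        String[] wanted = {"id", "first", "last", "city"};
        List<String[]> rows = new ArrayList<>();
        for (String line : lines.subList(1, lines.size())) {
            String[] cells = line.split(",", -1);
            String[] ordered = new String[wanted.length];
            for (int i = 0; i < wanted.length; i++) {
                // Look the column up by name, not by position.
                ordered[i] = cells[index.get(wanted[i])].trim();
            }
            rows.add(ordered);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "last,first,id,city",
            "doe,john,1,austin",
            "mary,jane,2,seattle");
        for (String[] row : readInFixedOrder(lines)) {
            System.out.println(String.join(",", row));
        }
        // prints:
        // 1,john,doe,austin
        // 2,jane,mary,seattle
    }
}
```

For real files, stick with the Commons CSV code above, since it also handles quoting, embedded commas, and line breaks inside fields.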