I have a .csv file full of data on over 500 companies. Each row in the file holds a particular company's dataset. I need to parse this file and extract data from each row to call 4 different web services.
The first line of the .csv file contains the column names. I am trying to write a method that takes a String param matching one of the column titles in the .csv file.
Based on this param, I want the method to parse the file using Java 8's Stream functionality and return a list of the values found under that column title, one per row/company.
I feel like I am making it more complicated than it needs to be but cannot think of a more efficient way to achieve my goal.
Any thoughts or ideas would be greatly appreciated.
Searching through Stack Overflow, I found the following post, which is similar but not quite the same: "Parsing a CSV file for a unique row using the new Java 8 Streams API".
public static List<String> getData(String titleToSearchFor) throws IOException{
    Path path = Paths.get("arbitoryPath");
    int titleIndex;
    String retrievedData = null;
    List<String> listOfData = null;
    if(Files.exists(path)){
        try(Stream<String> lines = Files.lines(path)){
            List<String> columns = lines
                .findFirst()
                .map((line) -> Arrays.asList(line.split(",")))
                .get();
            titleIndex = columns.indexOf(titleToSearchFor);
            List<List<String>> values = lines
                .skip(1)
                .map(line -> Arrays.asList(line.split(",")))
                .filter(list -> list.get(titleIndex) != null)
                .collect(Collectors.toList());
            String[] line = (String[]) values.stream().flatMap(l -> l.stream()).collect(Collectors.collectingAndThen(
                Collectors.toList(),
                list -> list.toArray()));
            String value = line[titleIndex];
            if(value != null && value.trim().length() > 0){
                retrievedData = value;
            }
            listOfData.add(retrievedData);
        }
    }
    return listOfTitles;
}
Thanks
Java 8 offers the possibility to create streams out of three primitive types: int, long and double. As Stream<T> is a generic interface, and there is no way to use primitives as a type parameter with generics, three new special interfaces were created: IntStream, LongStream, DoubleStream.
We can read a CSV file line by line using the readLine() method of BufferedReader class. Split each line on comma character to get the words of the line into an array. Now we can easily print the contents of the array by iterating over it or by using an appropriate index.
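A minimal sketch of that readLine() approach, assuming a simple file with no quoted commas. The class name, method name and sample data are made up for illustration:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class ReadLineSketch {

    // Reads every line from the source and splits it on commas.
    static List<String[]> readRows(Reader source) throws IOException {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                rows.add(line.split(","));
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical in-memory data standing in for a real file.
        List<String[]> rows = readRows(new StringReader("name,city\nAcme,London\n"));
        for (String[] row : rows) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

For a real file, a reader from Files.newBufferedReader(path) could be passed in instead of the StringReader.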
You should not reinvent the wheel; use a common CSV parser library instead. For example, you can just use Apache Commons CSV.
It will handle a lot of things for you and is much more readable. There is also OpenCSV, which is even more powerful and comes with annotation-based mappings to data classes.
try (Reader reader = Files.newBufferedReader(Paths.get("file.csv"));
     CSVParser csvParser = new CSVParser(reader, CSVFormat.DEFAULT
             .withFirstRecordAsHeader())) {
    for (CSVRecord csvRecord : csvParser) {
        // Access a value by its column name
        String name = csvRecord.get("MyColumn");
        // (..)
    }
}
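To give one concrete example of what a parser library handles for you: quoted fields. The small demo below uses an invented sample row and class name (nothing from the question) to show how a plain split(",") cuts a quoted company name in two:

```java
import java.util.Arrays;
import java.util.List;

class NaiveSplitDemo {

    // Splits a CSV line on every comma, ignoring CSV quoting rules.
    static List<String> naiveSplit(String line) {
        return Arrays.asList(line.split(","));
    }

    public static void main(String[] args) {
        // Hypothetical row: the company name itself contains a comma.
        String row = "\"Acme, Inc.\",London,42";
        List<String> fields = naiveSplit(row);
        // Prints 4, although the row has only 3 logical columns.
        System.out.println(fields.size());
        System.out.println(fields);
    }
}
```

If your data can contain values like this, every column index after the quoted field is shifted, which is hard to spot across 500 rows.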
Edit: Anyway, if you really want to do it on your own, take a look at this example.
I managed to shorten your snippet a bit.
If I get you correctly, you need all values of a particular column. The name of that column is given.
The idea is the same, but I improved reading from the file (it reads once) and removed code duplication (like line.split(",")) and unnecessary wraps in List (Collectors.toList()).
// read lines once
List<String[]> lines = lines(path).map(l -> l.split(","))
                                  .collect(toList());
// find the title index
int titleIndex = lines.stream()
                      .findFirst()
                      .map(header -> asList(header).indexOf(titleToSearchFor))
                      .orElse(-1);
// collect needed values
return lines.stream()
            .skip(1)
            .map(row -> row[titleIndex])
            .collect(toList());
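Note that the snippet above relies on static imports (lines, asList, toList) and assumes the column exists: if it doesn't, titleIndex is -1 and row[titleIndex] throws an ArrayIndexOutOfBoundsException. Here is a self-contained sketch of the same idea that fails early instead. The class name, method name and sample data are my own, and it still assumes there are no quoted commas in the file:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class CsvColumnReader {

    // Returns the values found under the column named `title`, one per data row.
    // `lines` is the whole file: header first, then one line per company.
    static List<String> columnValues(List<String> lines, String title) {
        if (lines.isEmpty()) {
            throw new IllegalArgumentException("CSV has no header line");
        }
        List<String> header = Arrays.stream(lines.get(0).split(","))
                .map(String::trim)
                .collect(Collectors.toList());
        int index = header.indexOf(title);
        if (index < 0) {
            // Fail early instead of hitting ArrayIndexOutOfBoundsException later.
            throw new IllegalArgumentException("Unknown column: " + title);
        }
        return lines.stream()
                .skip(1)                                // skip the header line
                .map(line -> line.split(",", -1))       // -1 keeps trailing empty cells
                .filter(cells -> index < cells.length)  // ignore rows that are too short
                .map(cells -> cells[index].trim())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample standing in for the real file contents.
        List<String> sample = Arrays.asList(
                "name,city,employees",
                "Acme,London,42",
                "Globex,Paris,7");
        System.out.println(columnValues(sample, "city")); // [London, Paris]
    }
}
```

To run it against a real file, you could pass in Files.readAllLines(path); for 500 rows, holding all lines in memory should not be a problem.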
I've got 2 tips not related to the issue:
1. You have hardcoded a URI; it's better to move the value to a constant or add a method param.
2. You could move the main part out of the if clause if you checked the opposite condition, !Files.exists(path), and threw an exception.
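Both tips together could look roughly like this. It is only a sketch: the class and method names are invented, and NoSuchFileException is just one reasonable choice of exception:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.List;

class GuardClauseSketch {

    // Tip 1: the path is a method parameter instead of a hardcoded string.
    static List<String> readLines(Path path) throws IOException {
        // Tip 2: check the opposite condition first and bail out early...
        if (!Files.exists(path)) {
            throw new NoSuchFileException(path.toString());
        }
        // ...so the main part is no longer nested inside an if block.
        return Files.readAllLines(path);
    }
}
```

With the guard clause the caller gets an exception for a missing file rather than a silent null or empty result.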