
Java - How to load CSV in Map data structure with key and values as POJO - Map<ClassA, ClassB>

I have a CSV file with datapoints as

student, year, subject, score1, score2, score3, ..., score100
Alex, 2010, Math, 23, 56, 43, ..., 89
Alex, 2011, Science, 45, 32, 45, ..., 65
Matt, 2009, Art, 34, 56, 75, ..., 43
Matt, 2010, Math, 43, 54, 54, ..., 32

What would be the best way to load such a CSV into a Map in Java? The data is used by a lookup service, hence the choice of a map. The key would be the Tuple (student, year), which maps to a list of subjects with their scores (SubjectScore.class). So the idea is: given the name of a student and a year, get all subjects and scores.

While searching, I didn't find an elegant way to read the CSV file into a Map of user-defined classes such as Map<Tuple, List<SubjectScore>>.

class Tuple {
  private String student;
  private int year;
}

class SubjectScore {
  private String subject;
  private int score1;
  private int score2;
  private int score3;
  // more fields here
  private int score100;
}

Additional details: the CSV file is large (~2 GB) but static, hence the decision to load it into memory.

asked Jan 29 '26 20:01 by here_to_learn


1 Answer

I was wondering how to take the same approach but convert it into Map<String, Map<Integer, List<SubjectScore>>>.

I have decided to add another answer because your requirements regarding the data type have changed. Assuming you still have the same SubjectScore class:

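import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SubjectScore {

    private String subject;
    private List<Integer> scores;

    // row is everything from the 3rd column onwards, e.g. "Math, 23, 56, 43"
    public SubjectScore(String row) {
        String[] data = row.split(",");
        this.subject = data[0];
        this.scores = Arrays.stream(data, 1, data.length)
                .map(item -> Integer.parseInt(item.trim()))
                .collect(Collectors.toList());
    }

    // readable output when the map is printed
    @Override
    public String toString() {
        return subject + " " + scores;
    }
}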

The old-fashioned way, with if-else blocks to check whether a key-value pair already exists:

public static void main(String[] args) throws IOException {

    List<String> allLines = Files.readAllLines(Paths.get("path to your file"));

    Map<String,Map<String, List<SubjectScore>>> mapOldWay = new HashMap<>();

    for(String line : allLines.subList(1, allLines.size())){
        //split each line into 3 parts: 1st column, 2nd column, and everything from the 3rd column onwards
        String[] data = line.split("\\s*,\\s*",3);
        if(mapOldWay.containsKey(data[0])){
            if(mapOldWay.get(data[0]).containsKey(data[1])){
                mapOldWay.get(data[0]).get(data[1]).add(new SubjectScore(data[2]));
            }
            else{
                mapOldWay.get(data[0]).put(data[1], new ArrayList<>());
                mapOldWay.get(data[0]).get(data[1]).add(new SubjectScore(data[2]));
            }
        }
        else{
            mapOldWay.put(data[0], new HashMap<>());
            mapOldWay.get(data[0]).put(data[1], new ArrayList<>());
            mapOldWay.get(data[0]).get(data[1]).add(new SubjectScore(data[2]));
        }
    }

    printMap(mapOldWay);
}

public static void printMap(Map<String, Map<String, List<SubjectScore>>> map) {
    map.forEach((outerkey,outervalue) -> {
        System.out.println(outerkey);
        outervalue.forEach((innerkey,innervalue)-> {
            System.out.println("\t" + innerkey + " : " + innervalue);
        });
    });
}

Same logic, but shorter, using Java 8 features (Map#computeIfAbsent):

public static void main(String[] args) throws IOException {

    List<String> allLines = Files.readAllLines(Paths.get("path to your file"));

    Map<String,Map<String, List<SubjectScore>>> mapJ8Features = new HashMap<>();
    for(String line : allLines.subList(1, allLines.size())){
        String data[] = line.split("\\s*,\\s*",3);
        mapJ8Features.computeIfAbsent(data[0], k -> new HashMap<>())
                .computeIfAbsent(data[1], k -> new ArrayList<>())
                .add(new SubjectScore(data[2]));
    }
}
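The same computeIfAbsent pattern should also work for the flat Map<Tuple, List<SubjectScore>> from the question, as long as the key class defines equals and hashCode (without them, a HashMap lookup with a freshly built key would not find anything). The following is only a minimal sketch under assumptions: the wrapper class TupleKeyedLoader and the load method are hypothetical names, and the values are kept as the raw "subject, scores" strings to keep it short. In real code you would add new SubjectScore(data[2]) instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical wrapper class, used only to keep this sketch self-contained.
class TupleKeyedLoader {

    // Composite key (student, year); equals/hashCode make it usable as a HashMap key.
    static final class Tuple {
        final String student;
        final int year;

        Tuple(String student, int year) {
            this.student = student;
            this.year = year;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Tuple)) return false;
            Tuple other = (Tuple) o;
            return year == other.year && student.equals(other.student);
        }

        @Override
        public int hashCode() {
            return Objects.hash(student, year);
        }
    }

    // Builds (student, year) -> rows; the first line is treated as the header.
    // Assumption: values are the raw remainder of each line, not SubjectScore objects.
    static Map<Tuple, List<String>> load(List<String> allLines) {
        Map<Tuple, List<String>> result = new HashMap<>();
        for (String line : allLines.subList(1, allLines.size())) {
            String[] data = line.split("\\s*,\\s*", 3);
            Tuple key = new Tuple(data[0], Integer.parseInt(data[1]));
            result.computeIfAbsent(key, k -> new ArrayList<>()).add(data[2]);
        }
        return result;
    }
}
```

A lookup would then be a single call such as map.get(new TupleKeyedLoader.Tuple("Alex", 2010)), which returns null when no row exists for that student and year.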

Another approach, using streams and nested Collectors#groupingBy:

public static void main(String[] args) throws IOException {
    Map<String,Map<String, List<SubjectScore>>> mapStreams = new HashMap<>();        
    try (Stream<String> content = Files.lines(Paths.get("path to your file"))) {
        mapStreams = content.skip(1).map(line -> line.split("\\s*,\\s*",3))
                .collect(Collectors.groupingBy(splited -> splited[0],
                         Collectors.groupingBy(splited -> splited[1], 
                         Collectors.mapping(splited -> new SubjectScore(splited[2]),Collectors.toList()))));
    } catch (IOException ex) {
        ex.printStackTrace();
    }
}

Note: I've only just realized that you wanted to represent the year as an Integer; I left it as a String. To change it, replace data[1] (or splited[1]) everywhere with Integer.parseInt(data[1]) (or Integer.parseInt(splited[1])), and change the map type to Map<String, Map<Integer, List<SubjectScore>>> accordingly.
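To make the Integer conversion from the note concrete, here is a minimal sketch of the computeIfAbsent variant with the year parsed as an int. The wrapper class YearKeyedLoader and the load method are hypothetical names added for illustration, and SubjectScore is nested with non-private fields only so that the example is self-contained. It assumes every row carries a well-formed integer in the second column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical wrapper class, used only to keep this sketch self-contained.
class YearKeyedLoader {

    // Same parsing idea as the SubjectScore class above.
    static class SubjectScore {
        final String subject;
        final List<Integer> scores;

        SubjectScore(String row) {
            String[] data = row.split(",");
            this.subject = data[0].trim();
            this.scores = Arrays.stream(data, 1, data.length)
                    .map(item -> Integer.parseInt(item.trim()))
                    .collect(Collectors.toList());
        }
    }

    // Builds student -> year -> scores; the first line is treated as the header.
    static Map<String, Map<Integer, List<SubjectScore>>> load(List<String> allLines) {
        Map<String, Map<Integer, List<SubjectScore>>> result = new HashMap<>();
        for (String line : allLines.subList(1, allLines.size())) {
            String[] data = line.split("\\s*,\\s*", 3);
            int year = Integer.parseInt(data[1]); // assumes a well-formed integer year
            result.computeIfAbsent(data[0], k -> new HashMap<>())
                    .computeIfAbsent(year, k -> new ArrayList<>())
                    .add(new SubjectScore(data[2]));
        }
        return result;
    }
}
```

With this shape, result.get("Alex").get(2010) returns the list of SubjectScore entries for that student and year. Note that the outer get returns null for an unknown student, so a real lookup service would need to guard against that.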

answered Jan 31 '26 08:01 by Eritrean


