
Group by and sum operation on a String array

I have a String[] dataValues as below:

ONE:9
TWO:23
THREE:14
FOUR:132
ONE:255
TWO:727
FIVE:3
THREE:196
FOUR:1843
ONE:330
TWO:336
THREE:190
FOUR:3664

I want to total the values of ONE, TWO, THREE, FOUR, FIVE.

So I created a HashMap for this:

Map<String, Integer> totals = new HashMap<String, Integer>();
for(String dataValue : dataValues){
    String[] keyVal = dataValue.split(":");
    totals.put(keyVal[0], totals.get(keyVal[0]).intValue() + Integer.parseInt(keyVal[1]));
}

But the above code will obviously throw the exception below if the key does not already exist in the map:

Exception in thread "main" java.lang.NullPointerException

What is the best way to get the totals in my use case above?

Vicky asked Mar 07 '26 14:03


2 Answers

You can just get the value for the given key and check whether it is null:

for(String dataValue : dataValues){
    String[] keyVal = dataValue.split(":");
    Integer i = totals.get(keyVal[0]);
    if(i == null) {
        totals.put(keyVal[0], Integer.parseInt(keyVal[1]));
    } else {
        totals.put(keyVal[0], i + Integer.parseInt(keyVal[1]));
    }
}

What is the best way to get the totals in my usecase above?

With Java 8 you can use the merge function:

for(String dataValue : dataValues){
    String[] keyVal = dataValue.split(":");
    totals.merge(keyVal[0], Integer.parseInt(keyVal[1]), Integer::sum);
}

What does this function do? Let's quote the documentation:

If the specified key is not already associated with a value or is associated with null, associates it with the given non-null value. Otherwise, replaces the associated value with the results of the given remapping function, or removes if the result is null

In other words, if there is no value associated with the key, the key is simply mapped to the int value of keyVal[1]. If there is already one, you need to provide a function that decides what to do with both values (the one that is already mapped and the one that you want to map).

In your case you want to sum them, so this function looks like (a, b) -> a + b. It can be replaced by the method reference Integer::sum, because Integer.sum takes two ints and returns an int, which makes it a valid candidate (and it has the semantics you need, of course).
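
To make the merge behaviour concrete, here is a minimal, self-contained sketch that runs the loop over the sample data from the question. The class name MergeTotals and the method name sumByKey are made up for this illustration, and a LinkedHashMap is used (an assumption, not something the question requires) only so that the printed order follows the order in which keys first appear.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, named here only for illustration.
final class MergeTotals {

    // Sums the numeric part of each "KEY:VALUE" entry, grouped by key.
    static Map<String, Integer> sumByKey(String[] dataValues) {
        // LinkedHashMap keeps keys in first-seen order (a choice made for readable output).
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (String dataValue : dataValues) {
            String[] keyVal = dataValue.split(":");
            // Absent key: store the parsed value. Present key: combine old and new with Integer::sum.
            totals.merge(keyVal[0], Integer.parseInt(keyVal[1]), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        String[] dataValues = {
            "ONE:9", "TWO:23", "THREE:14", "FOUR:132", "ONE:255", "TWO:727", "FIVE:3",
            "THREE:196", "FOUR:1843", "ONE:330", "TWO:336", "THREE:190", "FOUR:3664"
        };
        System.out.println(sumByKey(dataValues));
        // prints {ONE=594, TWO=1086, THREE=400, FOUR=5639, FIVE=3}
    }
}
```

Note that no null check is needed anywhere: the first time a key is seen, merge stores the value directly and never calls the remapping function.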


But wait, we can actually do better! This is where the Stream API and the Collectors class come in handy.

Get a Stream<String> from the file, split each line into an array, group the arrays by their first element (the key), map the second element (the value) to an int and sum them:

import static java.util.stream.Collectors.*;

...

Map<String, Integer> map = Files.lines(Paths.get("file"))
    .map(s -> s.split(":"))
    .collect(groupingBy(arr -> arr[0], summingInt(arr -> Integer.parseInt(arr[1])));

Another way would be to use the toMap collector:

.collect(toMap(arr -> arr[0], arr -> Integer.parseInt(arr[1]), Integer::sum));

From the same Stream<String[]>, you collect the results into a Map<String, Integer> in which the key is arr[0] and the value is the int held by arr[1]. If the same key occurs more than once, the values are merged by summing them.

Both give the same result. I prefer the first one because the name of the collector makes it clear that you are grouping elements, but the choice is up to you.

It may be a bit difficult to understand at first, but it's very powerful once you grasp the concept of these (downstream) collectors.
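
Since the question starts from a String[] rather than a file, the same two collectors can also be applied to Arrays.stream(dataValues). The following is a rough sketch under that assumption; the class name StreamTotals and the method names byGrouping and byToMap are invented for the example.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class, named here only for illustration.
final class StreamTotals {

    // Variant 1: group by the key and sum the parsed values with a downstream collector.
    static Map<String, Integer> byGrouping(String[] dataValues) {
        return Arrays.stream(dataValues)
            .map(s -> s.split(":"))
            .collect(Collectors.groupingBy(
                arr -> arr[0],
                Collectors.summingInt(arr -> Integer.parseInt(arr[1]))));
    }

    // Variant 2: build the map directly and resolve duplicate keys with Integer::sum.
    static Map<String, Integer> byToMap(String[] dataValues) {
        return Arrays.stream(dataValues)
            .map(s -> s.split(":"))
            .collect(Collectors.toMap(
                arr -> arr[0],
                arr -> Integer.parseInt(arr[1]),
                Integer::sum));
    }

    public static void main(String[] args) {
        String[] dataValues = {"ONE:9", "TWO:23", "ONE:255", "FIVE:3", "TWO:727"};
        // Both variants produce equal maps for the same input.
        System.out.println(byGrouping(dataValues).equals(byToMap(dataValues))); // prints true
    }
}
```

Neither collector promises a particular iteration order for the resulting map, so the sketch compares the two maps rather than printing them.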

Hope it helps! :)

Alexis C. answered Mar 10 '26 04:03


Since Java 8, instead of map.get you can use map.getOrDefault, which returns a default value of your choosing when the key has no mapping, for example:

totals.getOrDefault(keyVal[0], 0).intValue()
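
Dropped into the loop from the question, that expression might look like the sketch below. The class name GetOrDefaultTotals and the method name sumByKey are placeholders chosen for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, named here only for illustration.
final class GetOrDefaultTotals {

    // Sums the numeric part of each "KEY:VALUE" entry, grouped by key.
    static Map<String, Integer> sumByKey(String[] dataValues) {
        Map<String, Integer> totals = new HashMap<>();
        for (String dataValue : dataValues) {
            String[] keyVal = dataValue.split(":");
            // A key that has not been seen yet contributes 0 instead of null, so no NullPointerException.
            int current = totals.getOrDefault(keyVal[0], 0);
            totals.put(keyVal[0], current + Integer.parseInt(keyVal[1]));
        }
        return totals;
    }

    public static void main(String[] args) {
        String[] dataValues = {"ONE:9", "TWO:23", "ONE:255", "FIVE:3"};
        System.out.println(sumByKey(dataValues).get("ONE")); // prints 264
    }
}
```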
Pshemo answered Mar 10 '26 03:03