Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java Streams – How to group by value and find min and max value of each group?

Tags:

For my example, having car object and found that min and max price value based on model (group by).

List<Car> carsDetails = UserDB.getCarsDetails();
Map<String, DoubleSummaryStatistics> collect4 = carsDetails.stream()
                .collect(Collectors.groupingBy(Car::getMake, Collectors.summarizingDouble(Car::getPrice)));
collect4.entrySet().forEach(e->System.out.println(e.getKey()+" "+e.getValue().getMax()+" "+e.getValue().getMin()));

output :
Lexus 94837.79 17569.59
Subaru 96583.25 8498.41
Chevrolet 99892.59 6861.85

But I couldn't find which car objects have max and min price. How can I do that?

like image 672
Learn Hadoop Avatar asked Jul 17 '18 09:07

Learn Hadoop


People also ask

How do you use max and min in stream?

min method we get the minimum element of this stream for the given comparator. Using Stream. max method we get the maximum element of this stream for the given comparator. The min and max method both are stream terminal operations.

How do you find the max number of a list in a stream?

Calling stream() method on the list to get a stream of values from the list. Calling mapToInt(value -> value) on the stream to get an Integer Stream. Calling max() method on the stream to get the max value. Calling orElseThrow() to throw an exception if no value is received from max()

How do you find the max element in an array using stream?

2. Using Stream. max() method. The idea is to convert the list into a Stream and call Stream#max() that accepts a Comparator to compare items in the stream against each other to find the maximum element, and returns an Optional containing the maximum element in the stream according to the provided Comparator .


2 Answers

If you were interested in only one Car per group, you could use, e.g.

Map<String, Car> mostExpensives = carsDetails.stream()
    .collect(Collectors.toMap(Car::getMake, Function.identity(),
        BinaryOperator.maxBy(Comparator.comparing(Car::getPrice))));
mostExpensives.forEach((make,car) -> System.out.println(make+" "+car));

But since you want the most expensive and the cheapest, you need something like this:

Map<String, List<Car>> mostExpensivesAndCheapest = carsDetails.stream()
    .collect(Collectors.toMap(Car::getMake, car -> Arrays.asList(car, car),
        (l1,l2) -> Arrays.asList(
            (l1.get(0).getPrice()>l2.get(0).getPrice()? l2: l1).get(0),
            (l1.get(1).getPrice()<l2.get(1).getPrice()? l2: l1).get(1))));
mostExpensivesAndCheapest.forEach((make,cars) -> System.out.println(make
        +" cheapest: "+cars.get(0)+" most expensive: "+cars.get(1)));

This solution bears a bit of inconvenience due to the fact that there is no generic statistics object equivalent to DoubleSummaryStatistics. If this happens more than once, it’s worth filling the gap with a class like this:

/**
 * Like {@code DoubleSummaryStatistics}, {@code IntSummaryStatistics}, and
 * {@code LongSummaryStatistics}, but for an arbitrary type {@code T}.
 */
public class SummaryStatistics<T> implements Consumer<T> {
    /**
     * Collect to a {@code SummaryStatistics} for natural order.
     */
    public static <T extends Comparable<? super T>> Collector<T,?,SummaryStatistics<T>>
                  statistics() {
        return statistics(Comparator.<T>naturalOrder());
    }
    /**
     * Collect to a {@code SummaryStatistics} using the specified comparator.
     */
    public static <T> Collector<T,?,SummaryStatistics<T>>
                  statistics(Comparator<T> comparator) {
        Objects.requireNonNull(comparator);
        return Collector.of(() -> new SummaryStatistics<>(comparator),
            SummaryStatistics::accept, SummaryStatistics::merge);
    }
    private final Comparator<T> c;
    private T min, max;
    private long count;
    public SummaryStatistics(Comparator<T> comparator) {
        c = Objects.requireNonNull(comparator);
    }

    public void accept(T t) {
        if(count == 0) {
            count = 1;
            min = t;
            max = t;
        }
        else {
            if(c.compare(min, t) > 0) min = t;
            if(c.compare(max, t) < 0) max = t;
            count++;
        }
    }
    public SummaryStatistics<T> merge(SummaryStatistics<T> s) {
        if(s.count > 0) {
            if(count == 0) {
                count = s.count;
                min = s.min;
                max = s.max;
            }
            else {
                if(c.compare(min, s.min) > 0) min = s.min;
                if(c.compare(max, s.max) < 0) max = s.max;
                count += s.count;
            }
        }
        return this;
    }

    public long getCount() {
        return count;
    }

    public T getMin() {
        return min;
    }

    public T getMax() {
        return max;
    }

    @Override
    public String toString() {
        return count == 0? "empty": (count+" elements between "+min+" and "+max);
    }
}

After adding this to your code base, you may use it like

Map<String, SummaryStatistics<Car>> mostExpensives = carsDetails.stream()
    .collect(Collectors.groupingBy(Car::getMake,
        SummaryStatistics.statistics(Comparator.comparing(Car::getPrice))));
mostExpensives.forEach((make,cars) -> System.out.println(make+": "+cars));

If getPrice returns double, it may be more efficient to use Comparator.comparingDouble(Car::getPrice) instead of Comparator.comparing(Car::getPrice).

like image 148
Holger Avatar answered Sep 23 '22 19:09

Holger


Here is a very concise solution. It collects all Cars into a SortedSet and thus works without any additional classes.

Map<String, SortedSet<Car>> grouped = carDetails.stream()
        .collect(groupingBy(Car::getMake, toCollection(
                () -> new TreeSet<>(comparingDouble(Car::getPrice)))));

grouped.forEach((make, cars) -> System.out.println(make
        + " cheapest: " + cars.first()
        + " most expensive: " + cars.last()));

A possible downside is performance, as all Cars are collected, not just the current min and max. But unless the data set is very large, I don't think it will be noticeable.

like image 44
rolve Avatar answered Sep 23 '22 19:09

rolve