
Combining multiple Java streams in a structured way

I want to use Java's stream API to do some calculations on a list of objects:

List<Item>.stream()...

The Item class contains many attributes. For some of them I need the average value across all items in the collection; for others I need different kinds of calculations. I have been making separate stream/collector calls to achieve this, and although I'm not running into any performance issues (the list size is usually about 100), I want to learn how to be more concise, i.e. iterate over the list only once.

ItemCalculation itemCalculation = ItemCalculation.builder()
    .amountOfItems(itemList.size())
    .averagePrice(itemList.stream()
            .mapToDouble(item -> item.getPrice())
            .average()
            .getAsDouble())
    .averageInvestmentValue(itemList.stream()
            .mapToDouble(item -> getTotalInvestmentValue(item.getInvestmentValue(), item.getInvestmentValuePackaging()))
            .average()
            .getAsDouble())
    .highestWarrantyLimit(itemList.stream()... etc.

I read about creating a custom collector, but it seems a bit odd to have my "calculation" class be just one line (stream -> customCollector) and then have a very bloated collector class that does the actual logic. Because different attributes are collected in different ways, the collector would also need many intermediate counters and other variables. Any thoughts?

user1884155 asked Apr 24 '18 17:04



1 Answer

Unfortunately, there does not seem to be a reasonable way to restructure this with streams so that it performs better in single-threaded mode.

The code in your question is easy to understand and, as it stands, performs well enough for a small collection.

If you'd like to improve the performance of your solution, you can loop over the collection just once with a plain for loop, calculating every output you need in a single pass:

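    // Single pass over the list, accumulating every intermediate value.
    int amountOfItems = itemList.size();
    double priceSum = 0;
    double investmentValueSum = 0;
    // Start from NEGATIVE_INFINITY; Double.MIN_VALUE is the smallest positive double.
    double highestWarrantyLimit = Double.NEGATIVE_INFINITY;
    for (Item item : itemList) {
        priceSum += item.getPrice();
        investmentValueSum += getTotalInvestmentValue(item.getInvestmentValue(), item.getInvestmentValuePackaging());
        // getWarrantyLimit() is an assumed getter name; the question does not show it.
        highestWarrantyLimit = Math.max(highestWarrantyLimit, item.getWarrantyLimit());
    }
    // Note: for an empty list the two averages below evaluate to NaN.
    ItemCalculation itemCalculation = ItemCalculation.builder()
            .amountOfItems(amountOfItems)
            .averagePrice(priceSum / amountOfItems)
            .averageInvestmentValue(investmentValueSum / amountOfItems)
            .highestWarrantyLimit(highestWarrantyLimit)
            // ...
            .build();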

The Streams API was added to provide library support for processing sequences of data elements, which does fit your case. However, a stream pushes every element through one common pipeline, which does not fit your case: computing several unrelated aggregates at once means all the logic has to move into a single collector, so the pipeline ends up looking like this:

itemList.stream()
    .collect(toItemCalculation());

This is not very reasonable unless you are going to run it in multi-threaded (parallel) mode. In that case, a solution that uses a custom collector would be preferable, since the scaffolding for combining partial results is already built into the collector mechanism.
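For illustration, here is a minimal sketch of what such a custom collector could look like, built with `Collector.of`. The names `ItemStatsDemo`, `ItemStats`, and `toItemStats()` are hypothetical, the `Item` class is a cut-down stand-in with only two attributes, and `getWarrantyLimit()` is an assumed getter that the question does not show. It is not the asker's actual code, just one way the accumulate-and-combine logic might be arranged.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

class ItemStatsDemo {

    // Hypothetical stand-in for the question's Item class (two attributes only).
    static final class Item {
        private final double price;
        private final double warrantyLimit;

        Item(double price, double warrantyLimit) {
            this.price = price;
            this.warrantyLimit = warrantyLimit;
        }

        double getPrice() { return price; }
        double getWarrantyLimit() { return warrantyLimit; } // assumed getter name
    }

    // Mutable container holding the intermediate counters and sums.
    static final class ItemStats {
        private long count;
        private double priceSum;
        private double highestWarrantyLimit = Double.NEGATIVE_INFINITY;

        // Accumulator: folds one item into this container.
        void accept(Item item) {
            count++;
            priceSum += item.getPrice();
            highestWarrantyLimit = Math.max(highestWarrantyLimit, item.getWarrantyLimit());
        }

        // Combiner: merges a partial result produced by another thread.
        ItemStats combine(ItemStats other) {
            count += other.count;
            priceSum += other.priceSum;
            highestWarrantyLimit = Math.max(highestWarrantyLimit, other.highestWarrantyLimit);
            return this;
        }

        long getCount() { return count; }
        double getAveragePrice() { return count == 0 ? Double.NaN : priceSum / count; }
        double getHighestWarrantyLimit() { return highestWarrantyLimit; }
    }

    // Supplier, accumulator, and combiner are all the stream needs for a
    // sequential or parallel single-pass reduction.
    static Collector<Item, ItemStats, ItemStats> toItemStats() {
        return Collector.of(ItemStats::new, ItemStats::accept, ItemStats::combine);
    }

    public static void main(String[] args) {
        List<Item> itemList = Arrays.asList(
                new Item(10.0, 2.0), new Item(20.0, 5.0), new Item(30.0, 3.0));

        ItemStats stats = itemList.parallelStream().collect(toItemStats());

        System.out.println(stats.getCount());                // 3
        System.out.println(stats.getAveragePrice());         // 20.0
        System.out.println(stats.getHighestWarrantyLimit()); // 5.0
    }
}
```

In the question's setting, a finisher function passed as an extra argument to `Collector.of` could hand these values to `ItemCalculation.builder()`, so the call site stays a single `collect(...)`. The trade-off the question worries about remains: all intermediate state lives in one container class.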

Pavel answered Oct 21 '22 02:10
