Folks,
Consider the following example: given a list of Trade objects, my code needs to return an array containing the trade volume for the last 24 hours, 7 days, 30 days, and all time.
Using a plain old iterator, this requires only a single iteration over the collection.
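For reference, a single-pass loop of that kind might look roughly like the sketch below. The Trade class here is a hypothetical stand-in (the real one is not shown in this post); I'm only assuming it exposes getTimestamp() and getVolume().

```java
import java.util.Arrays;
import java.util.List;

class SinglePassVolumes {
    static final int DAY = 24 * 60 * 60;

    // Hypothetical stand-in for the real Trade class (assumed accessors only).
    static final class Trade {
        private final long timestamp;
        private final double volume;

        Trade(long timestamp, double volume) {
            this.timestamp = timestamp;
            this.volume = volume;
        }

        long getTimestamp() { return timestamp; }
        double getVolume() { return volume; }
    }

    // Returns {24h, 7d, 30d, all time} after one pass over the list.
    static double[] getTradeVolumes(List<Trade> trades, int timeStamp) {
        double volume24h = 0, volume7d = 0, volume30d = 0, volume = 0;
        for (Trade trade : trades) {
            long t = trade.getTimestamp();
            double v = trade.getVolume();
            volume += v;
            if (t + 30 * DAY > timeStamp) volume30d += v;
            if (t + 7 * DAY > timeStamp) volume7d += v;
            if (t + DAY > timeStamp) volume24h += v;
        }
        return new double[]{volume24h, volume7d, volume30d, volume};
    }

    public static void main(String[] args) {
        int now = 10_000_000;
        List<Trade> trades = Arrays.asList(
                new Trade(now - 100, 1.0),
                new Trade(now - 2 * DAY, 2.0),
                new Trade(now - 10 * DAY, 4.0),
                new Trade(now - 40 * DAY, 8.0));
        System.out.println(Arrays.toString(getTradeVolumes(trades, now)));
    }
}
```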
I'm trying to do the same using Java 8 streams and lambda expressions. I came up with this code, which looks elegant and works fine, but requires four iterations over the list:
public static final int DAY = 24 * 60 * 60;

public double[] getTradeVolumes(List<Trade> trades, int timeStamp) {
    double volume = trades.stream()
            .mapToDouble(Trade::getVolume)
            .sum();
    double volume30d = trades.stream()
            .filter(trade -> trade.getTimestamp() + 30 * DAY > timeStamp)
            .mapToDouble(Trade::getVolume)
            .sum();
    double volume7d = trades.stream()
            .filter(trade -> trade.getTimestamp() + 7 * DAY > timeStamp)
            .mapToDouble(Trade::getVolume)
            .sum();
    double volume24h = trades.stream()
            .filter(trade -> trade.getTimestamp() + DAY > timeStamp)
            .mapToDouble(Trade::getVolume)
            .sum();
    return new double[]{volume24h, volume7d, volume30d, volume};
}
How can I achieve the same result with only a single iteration over the list?
This problem is similar to the "summary statistics" collector. Take a look at the IntSummaryStatistics class:
public class IntSummaryStatistics implements IntConsumer {
    private long count;
    private long sum;
    ...
    public void accept(int value) {
        ++count;
        sum += value;
        min = Math.min(min, value);
        max = Math.max(max, value);
    }
    ...
}
It is designed to work with collect(); here's the implementation of IntStream.summaryStatistics():
public final IntSummaryStatistics summaryStatistics() {
    return collect(IntSummaryStatistics::new, IntSummaryStatistics::accept,
                   IntSummaryStatistics::combine);
}
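To see the pattern from the caller's side, here is a small sketch that computes the statistics twice: once through summaryStatistics() and once by passing the same three method references to collect() directly. The class and method names (SummaryStatisticsDemo, viaShortcut, viaCollect) are made up for illustration.

```java
import java.util.IntSummaryStatistics;
import java.util.stream.IntStream;

class SummaryStatisticsDemo {
    // Uses the built-in shortcut on IntStream.
    static IntSummaryStatistics viaShortcut(int... values) {
        return IntStream.of(values).summaryStatistics();
    }

    // Spells out the same mutable reduction: supplier, accumulator, combiner.
    static IntSummaryStatistics viaCollect(int... values) {
        return IntStream.of(values).collect(
                IntSummaryStatistics::new,
                IntSummaryStatistics::accept,
                IntSummaryStatistics::combine);
    }

    public static void main(String[] args) {
        IntSummaryStatistics a = viaShortcut(3, 1, 4, 1, 5);
        IntSummaryStatistics b = viaCollect(3, 1, 4, 1, 5);
        System.out.println(a);
        System.out.println(b);
    }
}
```

Both calls make a single pass over the values and yield count, sum, min, and max together, which is the shape of result the question is after.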
The benefit of writing a Collector like this is that your custom aggregation can then run in parallel.
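As a rough illustration for the trade-volume case, the sketch below builds such a collector with Collector.of, using a plain double[4] as the mutable container. This is only one possible shape, not a definitive implementation: the Trade class is a hypothetical stand-in for the one in the question, and tradeVolumes is a name invented here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

class TradeVolumeCollectorSketch {
    static final int DAY = 24 * 60 * 60;

    // Hypothetical stand-in for the question's Trade class.
    static final class Trade {
        private final long timestamp;
        private final double volume;

        Trade(long timestamp, double volume) {
            this.timestamp = timestamp;
            this.volume = volume;
        }

        long getTimestamp() { return timestamp; }
        double getVolume() { return volume; }
    }

    // Result layout: {24h, 7d, 30d, all time}, matching the question.
    static Collector<Trade, double[], double[]> tradeVolumes(int timeStamp) {
        return Collector.of(
                () -> new double[4],                      // one container per thread
                (double[] acc, Trade trade) -> {          // fold one trade in
                    long t = trade.getTimestamp();
                    double v = trade.getVolume();
                    acc[3] += v;
                    if (t + 30 * DAY > timeStamp) acc[2] += v;
                    if (t + 7 * DAY > timeStamp) acc[1] += v;
                    if (t + DAY > timeStamp) acc[0] += v;
                },
                (double[] left, double[] right) -> {      // merge partial results
                    for (int i = 0; i < left.length; i++) left[i] += right[i];
                    return left;
                });
    }

    public static void main(String[] args) {
        int now = 10_000_000;
        List<Trade> trades = Arrays.asList(
                new Trade(now - 100, 1.0),
                new Trade(now - 2 * DAY, 2.0),
                new Trade(now - 10 * DAY, 4.0),
                new Trade(now - 40 * DAY, 8.0));
        System.out.println(Arrays.toString(trades.stream().collect(tradeVolumes(now))));
        System.out.println(Arrays.toString(trades.parallelStream().collect(tradeVolumes(now))));
    }
}
```

The same collector works unchanged with stream() and parallelStream(). With general floating-point data, the parallel result may differ from the sequential one in the last bits, because the additions happen in a different order.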
Thanks Brian, I ended up implementing the code below. It's not as simple as I hoped, but at least it iterates only once, it's parallel-ready, and it passes my unit tests. Any ideas for improvement are welcome.
public double[] getTradeVolumes(List<Trade> trades, int timeStamp) {
    // Mutable reduction: supplier, accumulator, combiner.
    TradeVolume tradeVolume = trades.stream().collect(
            () -> new TradeVolume(timeStamp),
            TradeVolume::accept,
            TradeVolume::combine);
    return tradeVolume.getVolume();
}

public static final int DAY = 24 * 60 * 60;

static class TradeVolume {
    private final int timeStamp;
    // Index 0: 24 hours, 1: 7 days, 2: 30 days, 3: all time.
    private final double[] volume = new double[4];

    TradeVolume(int timeStamp) {
        this.timeStamp = timeStamp;
    }

    // Adds one trade to every window it falls into, widest window first.
    public void accept(Trade trade) {
        long tradeTime = trade.getTimestamp();
        double tradeVolume = trade.getVolume();
        volume[3] += tradeVolume;
        if (tradeTime + 30 * DAY <= timeStamp) {
            return;
        }
        volume[2] += tradeVolume;
        if (tradeTime + 7 * DAY <= timeStamp) {
            return;
        }
        volume[1] += tradeVolume;
        if (tradeTime + DAY <= timeStamp) {
            return;
        }
        volume[0] += tradeVolume;
    }

    // Merges the partial result of another thread into this one.
    public void combine(TradeVolume tradeVolume) {
        volume[0] += tradeVolume.volume[0];
        volume[1] += tradeVolume.volume[1];
        volume[2] += tradeVolume.volume[2];
        volume[3] += tradeVolume.volume[3];
    }

    public double[] getVolume() {
        return volume;
    }
}