I have a class like this:
    class MultiDataPoint {
        private DateTime timestamp;
        private Map<String, Number> keyToData;
    }
and I want to produce, for each MultiDataPoint:
    class DataSet {
        public String key;
        List<DataPoint> dataPoints;
    }
    class DataPoint {
        DateTime timeStamp;
        Number data;
    }
Of course, a 'key' can be the same across multiple MultiDataPoints.
So given a List<MultiDataPoint>, how do I use Java 8 streams to convert it to a List<DataSet>?
This is how I am currently doing the conversion without streams:
    Collection<DataSet> convertMultiDataPointToDataSet(List<MultiDataPoint> multiDataPoints) {
        Map<String, DataSet> setMap = new HashMap<>();
        multiDataPoints.forEach(pt -> {
            Map<String, Number> data = pt.getData();
            data.entrySet().forEach(e -> {
                String seriesKey = e.getKey();
                DataSet dataSet = setMap.get(seriesKey);
                if (dataSet == null) {
                    dataSet = new DataSet(seriesKey);
                    setMap.put(seriesKey, dataSet);
                }
                dataSet.dataPoints.add(new DataPoint(pt.getTimestamp(), e.getValue()));
            });
        });
        return setMap.values();
    }
The map method of a Java 8 Stream is an intermediate operation: it consumes a single element from the input Stream and produces a single element in the output Stream. It is used to convert a Stream of one type into a Stream of another type.
Converting only the values of a Map<Key, Value> into a Stream: this can be done with Map.values(), which returns a Collection view of the values contained in the map. That collection can then be turned into a Stream of values with Collection.stream().
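As a small illustration of the two points above, the following sketch streams only the values of a map and converts each one with map(). The names MapValuesStreamSketch and describeValues are hypothetical and exist only for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MapValuesStreamSketch {
    // Streams the values of the map (keys are ignored) and maps each
    // Integer to a String, one output element per input element.
    static List<String> describeValues(Map<String, Integer> readings) {
        return readings.values().stream()
                .map(v -> "value=" + v)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output order is predictable.
        Map<String, Integer> readings = new LinkedHashMap<>();
        readings.put("cpu", 42);
        readings.put("mem", 7);
        System.out.println(describeValues(readings)); // prints [value=42, value=7]
    }
}
```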
To do this, I had to come up with an intermediate data structure:
    class KeyDataPoint {
        String key;
        DateTime timestamp;
        Number data;
        // obvious constructor and getters
    }
With this in place, the approach is to "flatten" each MultiDataPoint into a list of (timestamp, key, data) triples and stream together all such triples from the list of MultiDataPoint.
Then, we apply a groupingBy operation on the string key in order to gather the data for each key together. Note that a simple groupingBy would result in a map from each string key to a list of the corresponding KeyDataPoint triples. We don't want the triples; we want DataPoint instances, which are (timestamp, data) pairs. To do this we apply a "downstream" collector of the groupingBy which is a mapping operation that constructs a new DataPoint by getting the right values from the KeyDataPoint triple. The downstream collector of the mapping operation is simply toList which collects the DataPoint objects of the same group into a list.
Now we have a Map<String, List<DataPoint>> and we want to convert it to a collection of DataSet objects. We simply stream out the map entries and construct DataSet objects, collect them into a list, and return it.
The code ends up looking like this:
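    // assumes: import static java.util.stream.Collectors.*;
    Collection<DataSet> convertMultiDataPointToDataSet(List<MultiDataPoint> multiDataPoints) {
        return multiDataPoints.stream()
            // flatten every MultiDataPoint into (key, timestamp, data) triples
            .flatMap(mdp -> mdp.getData().entrySet().stream()
                .map(e -> new KeyDataPoint(e.getKey(), mdp.getTimestamp(), e.getValue())))
            // group the triples by key, keeping only (timestamp, data) pairs
            .collect(groupingBy(KeyDataPoint::getKey,
                mapping(kdp -> new DataPoint(kdp.getTimestamp(), kdp.getData()), toList())))
            // turn each map entry into a DataSet
            .entrySet().stream()
            .map(e -> new DataSet(e.getKey(), e.getValue()))
            .collect(toList());
    }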
I took some liberties with constructors and getters, but I think they should be obvious.
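For readers who want to run the approach end to end, here is a minimal self-contained sketch with the constructors filled in. It makes some assumptions that are not in the original: java.time.Instant stands in for DateTime so that no external library is needed, fields are read directly instead of through getters, and the wrapper class name FlattenGroupSketch is hypothetical.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FlattenGroupSketch {
    // Stand-ins for the classes in the question (Instant replaces DateTime: assumption).
    static class MultiDataPoint {
        final Instant timestamp;
        final Map<String, Number> keyToData;
        MultiDataPoint(Instant timestamp, Map<String, Number> keyToData) {
            this.timestamp = timestamp;
            this.keyToData = keyToData;
        }
    }

    static class DataPoint {
        final Instant timeStamp;
        final Number data;
        DataPoint(Instant timeStamp, Number data) {
            this.timeStamp = timeStamp;
            this.data = data;
        }
    }

    static class DataSet {
        final String key;
        final List<DataPoint> dataPoints;
        DataSet(String key, List<DataPoint> dataPoints) {
            this.key = key;
            this.dataPoints = dataPoints;
        }
    }

    // Intermediate (key, timestamp, data) triple.
    static class KeyDataPoint {
        final String key;
        final Instant timestamp;
        final Number data;
        KeyDataPoint(String key, Instant timestamp, Number data) {
            this.key = key;
            this.timestamp = timestamp;
            this.data = data;
        }
    }

    static List<DataSet> convert(List<MultiDataPoint> multiDataPoints) {
        return multiDataPoints.stream()
                // one triple per map entry of every MultiDataPoint
                .flatMap(mdp -> mdp.keyToData.entrySet().stream()
                        .map(e -> new KeyDataPoint(e.getKey(), mdp.timestamp, e.getValue())))
                // group by key; the downstream collector builds DataPoint lists
                .collect(Collectors.groupingBy(kdp -> kdp.key,
                        Collectors.mapping(kdp -> new DataPoint(kdp.timestamp, kdp.data),
                                Collectors.toList())))
                .entrySet().stream()
                .map(e -> new DataSet(e.getKey(), e.getValue()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Number> first = new HashMap<>();
        first.put("a", 1);
        first.put("b", 2);
        Map<String, Number> second = new HashMap<>();
        second.put("a", 3);

        List<DataSet> sets = convert(Arrays.asList(
                new MultiDataPoint(Instant.ofEpochSecond(1), first),
                new MultiDataPoint(Instant.ofEpochSecond(2), second)));

        // The order of the data sets is not guaranteed, since groupingBy uses a HashMap.
        for (DataSet s : sets) {
            System.out.println(s.key + " has " + s.dataPoints.size() + " data point(s)");
        }
    }
}
```

Within each DataSet the data points keep the order of the input list, because the stream is sequential and groupingBy preserves encounter order inside a group.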
It's an interesting question, because it shows that there are a lot of different approaches to achieve the same result. Below I show three different implementations.
Default methods in Collection Framework: Java 8 added some methods to the collection classes that are not directly related to the Stream API. Using these methods, you can significantly simplify the non-stream implementation:
    Collection<DataSet> convert(List<MultiDataPoint> multiDataPoints) {
        Map<String, DataSet> result = new HashMap<>();
        multiDataPoints.forEach(pt ->
            pt.keyToData.forEach((key, value) ->
                result.computeIfAbsent(key, k -> new DataSet(k, new ArrayList<>()))
                    .dataPoints.add(new DataPoint(pt.timestamp, value))));
        return result.values();
    }
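To see computeIfAbsent in isolation, here is a minimal sketch of the same "create the container on first use, then add to it" pattern with plain strings and integers. The class name ComputeIfAbsentSketch and the method group are hypothetical; only Map.computeIfAbsent itself comes from the JDK.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ComputeIfAbsentSketch {
    // Groups values[i] under keys[i]. computeIfAbsent returns the existing list
    // for a key, or creates, stores and returns a new one if the key is absent.
    static Map<String, List<Integer>> group(String[] keys, int[] values) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (int i = 0; i < keys.length; i++) {
            result.computeIfAbsent(keys[i], k -> new ArrayList<>()).add(values[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> grouped =
                group(new String[] {"a", "b", "a"}, new int[] {1, 2, 3});
        System.out.println(grouped); // prints {a=[1, 3], b=[2]}
    }
}
```

This replaces the explicit get, null check and put sequence of the original non-stream code with a single call.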
Stream API with flatten and intermediate data structure: The following implementation is almost identical to the solution provided by Stuart Marks. In contrast to his solution, it uses an anonymous inner class as the intermediate data structure.
    Collection<DataSet> convert(List<MultiDataPoint> multiDataPoints) {
        return multiDataPoints.stream()
            .flatMap(mdp -> mdp.keyToData.entrySet().stream().map(e ->
                new Object() {
                    String key = e.getKey();
                    DataPoint dataPoint = new DataPoint(mdp.timestamp, e.getValue());
                }))
            .collect(
                collectingAndThen(
                    groupingBy(t -> t.key, mapping(t -> t.dataPoint, toList())),
                    m -> m.entrySet().stream()
                        .map(e -> new DataSet(e.getKey(), e.getValue()))
                        .collect(toList())));
    }
Stream API with map merging: Instead of flattening the original data structures, you can also create a Map for each MultiDataPoint, and then merge all maps into a single map with a reduce operation. The code is a bit simpler than the above solution:
    Collection<DataSet> convert(List<MultiDataPoint> multiDataPoints) {
        return multiDataPoints.stream()
            .map(mdp -> mdp.keyToData.entrySet().stream()
                .collect(toMap(e -> e.getKey(), e -> asList(new DataPoint(mdp.timestamp, e.getValue())))))
            .reduce(new HashMap<>(), mapMerger())
            .entrySet().stream()
            .map(e -> new DataSet(e.getKey(), e.getValue()))
            .collect(toList());
    }
You can find an implementation of the map merger within the Collectors class. Unfortunately, it is a bit tricky to access it from the outside. Following is an alternative implementation of the map merger:
    <K, V> BinaryOperator<Map<K, List<V>>> mapMerger() {
        return (lhs, rhs) -> {
            Map<K, List<V>> result = new HashMap<>();
            lhs.forEach((key, value) -> result.computeIfAbsent(key, k -> new ArrayList<>()).addAll(value));
            rhs.forEach((key, value) -> result.computeIfAbsent(key, k -> new ArrayList<>()).addAll(value));
            return result;
        };
    }
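The following sketch shows how such a merger behaves when used with reduce on a few small maps. The mapMerger body mirrors the one above; the wrapper class MapMergerSketch and the method mergeAll are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

class MapMergerSketch {
    // Merges two maps of lists into a new map; lists under the same key are
    // concatenated, left-hand values first. Neither input map is modified.
    static <K, V> BinaryOperator<Map<K, List<V>>> mapMerger() {
        return (lhs, rhs) -> {
            Map<K, List<V>> result = new HashMap<>();
            lhs.forEach((key, value) -> result.computeIfAbsent(key, k -> new ArrayList<>()).addAll(value));
            rhs.forEach((key, value) -> result.computeIfAbsent(key, k -> new ArrayList<>()).addAll(value));
            return result;
        };
    }

    // Folds all maps into one, starting from an empty map as the identity.
    static Map<String, List<Integer>> mergeAll(List<Map<String, List<Integer>>> maps) {
        return maps.stream().reduce(new HashMap<>(), mapMerger());
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> first = new HashMap<>();
        first.put("a", Collections.singletonList(1));
        Map<String, List<Integer>> second = new HashMap<>();
        second.put("a", Collections.singletonList(2));
        second.put("b", Collections.singletonList(3));

        Map<String, List<Integer>> merged = mergeAll(Arrays.asList(first, second));
        System.out.println("a -> " + merged.get("a")); // prints a -> [1, 2]
        System.out.println("b -> " + merged.get("b")); // prints b -> [3]
    }
}
```

Because the merger copies into a fresh map on every step, the identity map passed to reduce is never mutated, and the fixed-size lists produced by asList in the calling code are never written to. The trade-off is extra copying on each merge.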