Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java 8 partition list into groups by condition involving previous elements

Suppose I have a list of intervals (sorted by start) and I want to break them up so that I have a list of overlapping groups of intervals. So, for example, with Interval as:

public class Interval {
    private final int start;
    private final int end;

    public Interval(int start,int end){
        this.start = start;
        this.end = end;
    }

    public int getStart(){return start;}
    public int getEnd(){return end;}

    public String toString(){ return "("+start+","+end+")"; }
}

And a List<Interval> like:

[(0,4),(1,7),(6,10),(13,17),(20,100),(22,31),(60,65)]

I want an output of List<List<Interval>>:

[[(0,4),(1,7),(6,10)],[(13,17)],[(20,100),(22,31),(60,65)]]

I can code this up, but I'm really enjoying the more functional approach of Java 8, and want to know if there is anything like an idiomatic way to do this using Java 8 streams.

I've had a look at the "group by" styles of provided Collectors, but they don't seem to apply since I'm not really grouping by a classifier- you can't compute the groups based only on a property of each individual element, you have to consider the properties of each element in relation to the groups that have been computed so far.

Certainly there is non-crazy way to do this in functional languages (though I speak as someone who isn't really a functional programmer :-) ). How can I do it with streams in Java 8?

like image 709
codeCogs Avatar asked Mar 10 '16 16:03

codeCogs


2 Answers

Your were looking at the right place when studying the groupingBy collectors, but you are also right that they won’t provide the necessary logic for merging intervals. But they are conceptionally merging elements into the state created by previous elements. You have to implement a similar collector yourself.

Relying on your specification that the elements are already presorted by their start index, you can do it like:

Comparator<Interval> byStart = Comparator.comparingInt(Interval::getStart);
Comparator<Interval> byEnd   = Comparator.comparingInt(Interval::getEnd);
Collection<List<Interval>> merged = intervalList.stream().collect(
        () -> new TreeMap<Interval,List<Interval>>(byStart),
        (map,i) -> {
            Map.Entry<Interval,List<Interval>> e=map.floorEntry(i);
            if(e!=null && Collections.max(e.getValue(), byEnd).getEnd()>=i.getStart())
                e.getValue().add(i);
            else map.computeIfAbsent(i, x->new ArrayList<>()).add(i);
        },
        (m1,m2) -> m2.forEach((i,list) -> {
            Map.Entry<Interval,List<Interval>> e=m1.floorEntry(i);
            if(e!=null && Collections.max(e.getValue(), byEnd).getEnd()>=i.getStart())
                e.getValue().addAll(list);
            else m1.put(i, list);
        })
    ).values();

This creates a Collection rather than a List, but you can simply create a List out of it:

List<List<Interval>> list = new ArrayList<>(merged);

You should do that definitely if you intend to keep the result for a longer time rather than processing it immediately, as the Collection returned by the collector is a view into a TreeMap holding more resources than necessary.

I guess, in most cases you are better off with a loop based solution.

like image 187
Holger Avatar answered Nov 15 '22 12:11

Holger


You cannot. Streams are not suited to this sort of problem; streams have no notion of "previous elements" and are allowed to operate over elements in arbitrary order. You can do it in Java, sure, and you can do it in functional languages, but that doesn't mean streams work like the functional language data structures you're used to.

like image 4
Louis Wasserman Avatar answered Nov 15 '22 12:11

Louis Wasserman