
Collecting stream back into the same collection type

Suppose I have a collection of an unknown type. I want to stream it, perform some operations on the stream, and collect the result back into the same collection type as the original. For instance:

Collection<? extends Integer> getBigger(Collection<? extends Integer> col, int value) {
    return col.stream().filter(v -> v > value).collect(????);
} 

The idea of this incomplete code example is to return a List if col is a List (or any subtype of it), a Set if col is a Set, and so on. The method name and the actual operations on the stream are not important here; I've specified them only to illustrate my question. So, is it possible?

Leonid asked Apr 08 '14 15:04



2 Answers

It is not possible without violating the principle on which the Java streams framework is built. It would completely violate the idea of abstracting the stream from its physical representation.

The sequence of bulk data operations goes in a pipeline (figure: Pipeline: A Sequence of Bulk Data Operations).

A stream is somewhat similar to Schrödinger's cat: it is not materialized until you call the terminal operation. Stream handling is completely abstract and detached from the original stream source.
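To make the "not materialized until the terminal operation" point concrete, here is a minimal sketch. The class name LazyStreamDemo, the helper buildPipeline and the inspected list are hypothetical names introduced only for this illustration. The filter records each element it sees, so you can observe that nothing is pulled from the source until collect is called.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamDemo {

    // Records every element the filter actually inspects (illustration only).
    static final List<Integer> inspected = new ArrayList<>();

    // Hypothetical helper: builds the pipeline but runs no terminal operation.
    static Stream<Integer> buildPipeline(List<Integer> source, int value) {
        return source.stream()
                .filter(v -> {
                    inspected.add(v);
                    return v > value;
                });
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(1, 5, 8, 12);

        Stream<Integer> pipeline = buildPipeline(source, 6);
        System.out.println("Inspected before terminal op: " + inspected);

        // The terminal operation is what finally drives the elements through.
        List<Integer> result = pipeline.collect(Collectors.toList());
        System.out.println("Inspected after terminal op: " + inspected);
        System.out.println("Result: " + result);
    }
}
```

Note that the collector at the end only receives elements. It has no handle on the class of the collection that produced them, which is why the result type has to come from somewhere else.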

(Figure: Pipeline as a Black Box)

If you want to work at such a low level with your original data storage, don't be ashamed to simply avoid streams. They are just a tool, not anything sacred. Introducing streams did not make the Good Old Collections any worse; they are as good as they ever were, with the added value of internal iteration through the new Iterable.forEach() method.
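As one possible reading of "avoiding the streams", the sketch below filters the collection in place with Collection.removeIf (a default method added in Java 8). The class name InPlaceFilterDemo and the helper keepBigger are hypothetical. This is a different trade-off from the question, not a drop-in answer: it mutates the argument instead of returning a new collection, and it throws UnsupportedOperationException on unmodifiable or fixed-size collections such as the one returned by Arrays.asList().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;

class InPlaceFilterDemo {

    // Hypothetical helper: removes every element that is not bigger than value.
    // It mutates col and returns the same instance, so the concrete type
    // the caller passed in is trivially preserved.
    static <C extends Collection<Integer>> C keepBigger(C col, int value) {
        col.removeIf(v -> v <= value);
        return col;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = new ArrayList<>(Arrays.asList(1, 4, 9, 13));
        HashSet<Integer> set = new HashSet<>(Arrays.asList(1, 4, 9, 13));

        System.out.println(keepBigger(list, 6) + " " + list.getClass());
        System.out.println(keepBigger(set, 6) + " " + set.getClass());
    }
}
```

If the original must stay untouched, the caller can copy it first, which again requires knowing how to construct the right class.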


Added to satisfy your curiosity :)

A possible solution follows. I don't like it myself, and I have not been able to solve all the generics issues there, but it works with limitations.

The idea is to create a collector that returns the same type as the input collection. However, not all collections provide a nullary constructor (one with no parameters), and without it the Class.newInstance() method does not work. Checked exceptions are also awkward to handle within a lambda expression. (This is mentioned in this nice answer: https://stackoverflow.com/a/22919112/2886891)

public Collection<Integer> getBiggerThan(Collection<Integer> col, int value) {
    // Collection below is an example of one of the rare appropriate
    // uses of raw types. getClass returns the runtime type of col, and
    // at runtime all type parameters have been erased.
    @SuppressWarnings("rawtypes")
    final Class<? extends Collection> clazz = col.getClass();
    System.out.println("Input collection type: " + clazz);
    // The raw Collection created reflectively is converted to
    // Collection<Integer>, which is an unchecked conversion.
    @SuppressWarnings("unchecked")
    final Supplier<Collection<Integer>> supplier = () -> {
        try {
            // Class.newInstance() is deprecated since Java 9; invoking the
            // nullary constructor explicitly is the equivalent call.
            return clazz.getDeclaredConstructor().newInstance();
        }
        catch (ReflectiveOperationException e) {
            throw new RuntimeException(
                    "A checked exception caught inside lambda", e);
        }
    };
    // After all the ugly preparatory code, enjoy the clean pipeline:
    return col.stream()
            .filter(v -> v > value)
            .collect(supplier, Collection::add, Collection::addAll);
}

As you can see, it works in general, provided that your original collection has a nullary constructor.

public void test() {
    final Collection<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

    final Collection<Integer> arrayList = new ArrayList<>(numbers);
    final Collection<Integer> arrayList2 = getBiggerThan(arrayList, 6);
    System.out.println(arrayList2);
    System.out.println(arrayList2.getClass());
    System.out.println();

    final Collection<Integer> set = new HashSet<>(arrayList);
    final Collection<Integer> set2 = getBiggerThan(set, 6);
    System.out.println(set2);
    System.out.println(set2.getClass());
    System.out.println();

    // This does not work because Arrays.asList() returns a
    // java.util.Arrays$ArrayList, which does not provide a nullary
    // constructor, so the supplier throws a RuntimeException.
    final Collection<Integer> numbers2 = getBiggerThan(numbers, 6);
}

Honza Zidek answered Oct 13 '22 18:10


There are two issues here: (1) the runtime type (class) of the input and its result, and (2) the compile-time type of the input and its result.

For (1), it may seem strange, but in general it's not possible in Java to create a copy of an instance of an arbitrary class. Using getClass().newInstance() might not work if the class doesn't have an accessible no-arg constructor or if it's immutable. The object might not be Cloneable either. Thus, the caller needs to pass in a supplier that's responsible for creating an instance of the right result class.

For (2), a suitable dose of generics can make this type-safe at compile time.

<T extends Comparable<T>, C extends Collection<T>> C getBigger(
        C col, T value, Supplier<C> supplier) {
    return col.stream()
              .filter(v -> v.compareTo(value) > 0)
              .collect(Collectors.toCollection(supplier::get));
}

Note that there is a bound of Comparable<T> on the type parameter T so that the caller is restricted to passing a collection of things that are comparable. This lets us use compareTo to compare the values. We also use the Collectors.toCollection method and pass the supplier's get method through to it.

Examples of use:

List<Integer> input1 = Arrays.asList(1, 4, 9, 13, 14, 22);
List<Integer> filtered1 = getBigger(input1, 10, ArrayList::new);

Set<String> input2 = new HashSet<>();
input2.add("foo");
input2.add("bar");
input2.add("baz");
input2.add("qux");
Set<String> filtered2 = getBigger(input2, "c", HashSet::new);
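
Putting the method and its usage together, the following is a self-contained sketch that can be compiled and run as is. The class name GetBiggerDemo and the TreeSet example are additions for illustration and are not part of the original answer. It also prints the runtime classes, to show that the result class is whatever the supplier creates, not necessarily the class of the input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class GetBiggerDemo {

    // Same signature as the method above: the caller supplies the factory
    // that decides the runtime class of the result.
    static <T extends Comparable<T>, C extends Collection<T>> C getBigger(
            C col, T value, Supplier<C> supplier) {
        return col.stream()
                  .filter(v -> v.compareTo(value) > 0)
                  .collect(Collectors.toCollection(supplier::get));
    }

    public static void main(String[] args) {
        // The input is a java.util.Arrays$ArrayList; the result is an ArrayList.
        List<Integer> input1 = Arrays.asList(1, 4, 9, 13, 14, 22);
        List<Integer> filtered1 = getBigger(input1, 10, ArrayList::new);
        System.out.println(filtered1 + " " + filtered1.getClass());

        // Here both the compile-time and the runtime type are TreeSet.
        TreeSet<String> input2 = new TreeSet<>(Arrays.asList("foo", "bar", "baz", "qux"));
        TreeSet<String> filtered2 = getBigger(input2, "c", TreeSet::new);
        System.out.println(filtered2 + " " + filtered2.getClass());
    }
}
```

Because C is inferred from the argument, declaring the input as TreeSet<String> rather than Set<String> makes the compiler reject a supplier that produces any other kind of set.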

Stuart Marks answered Oct 13 '22 20:10