I have a situation where I am reading from a database and returning a List<String>, where each string is selected and added to the list according to some criteria. The method signature is:
public List<String> myMethod(String query, int limit)
The second parameter provides an upper bound on the size of the returned list (setting limit=-1 will remove any size restriction). To avoid making this method memory-intensive, I have written an equivalent method that returns Stream<String> instead of a list. (Note: I don't need random access to the returned elements or any other list-specific functionality.)
However, I am a bit skeptical about returning a Stream<>, especially since the method is public. Is it safe to have a public method returning a Stream<> in Java?
In most cases you should return a Stream. It is more flexible, is designed for better performance, and can easily be turned into a Collection. You should return a Collection when there are strong consistency requirements and you have to produce a snapshot of a moving target.
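To illustrate the "easily turned into a Collection" point, here is a minimal sketch. The class StreamToCollectionSketch and the method names() are hypothetical stand-ins for any method that returns a Stream; they are not from the question.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamToCollectionSketch {
    // Hypothetical stand-in for a public method that returns its results as a Stream.
    static Stream<String> names() {
        return Stream.of("alice", "bob", "carol");
    }

    public static void main(String[] args) {
        // A caller that wants a List materializes the stream itself.
        List<String> all = names().collect(Collectors.toList());

        // A caller that needs only part of the data never builds the full list.
        List<String> firstTwo = names().limit(2).collect(Collectors.toList());

        System.out.println(all);
        System.out.println(firstTwo);
    }
}
```

The caller decides whether to pay for a full collection, which is the flexibility the answer refers to.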
Generating streams: as of Java 8, the Collection interface has two methods that generate a Stream. stream() returns a sequential stream with the collection as its source. parallelStream() returns a possibly parallel stream with the collection as its source.
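A small sketch of the two Collection methods follows. The class and helper names are made up for illustration; only stream(), parallelStream() and isParallel() are JDK API.

```java
import java.util.Arrays;
import java.util.List;

class GeneratingStreamsSketch {
    // Sums the list through a sequential stream.
    static int sumSequential(List<Integer> xs) {
        return xs.stream().mapToInt(Integer::intValue).sum();
    }

    // Sums the list through parallelStream(); the result must be the same.
    static int sumParallel(List<Integer> xs) {
        return xs.parallelStream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<Integer> xs = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(sumSequential(xs));
        System.out.println(sumParallel(xs));
        // isParallel() reports which execution mode a given stream would use.
        System.out.println(xs.stream().isParallel());
    }
}
```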
Streams are not serializable by default. One likely reason is that a stream may be backed by an underlying data source that is not itself serializable and that is not returned along with the stream.
A stream should be operated on (invoking an intermediate or terminal stream operation) only once. A stream implementation may throw IllegalStateException if it detects that the stream is being reused. So the answer is no, streams are not meant to be reused.
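The single-use rule can be observed with a short sketch like the one below. The class and method names are hypothetical. The exception is what the standard JDK stream implementation throws; the specification only says an implementation may throw it.

```java
import java.util.stream.Stream;

class StreamReuseSketch {
    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondUseFails() {
        Stream<String> s = Stream.of("a", "b", "c");
        s.forEach(x -> { });          // first terminal operation consumes the stream
        try {
            s.forEach(x -> { });      // second use of the same stream object
            return false;
        } catch (IllegalStateException e) {
            return true;              // the implementation detected the reuse
        }
    }

    public static void main(String[] args) {
        System.out.println(secondUseFails());
    }
}
```

For a public method that returns a Stream, this means each call should hand out a fresh stream, and callers should not cache the returned object for later reuse.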
Not only is it safe, it is recommended by the chief Java architect.
Especially if your data is I/O-based and thus not yet materialized in memory at the time myMethod is called, it would be highly advisable to return a Stream instead of a List. The client may need to consume only a part of it, or to aggregate it into some data of fixed size. In those cases you have the chance to go from an O(n) memory requirement to O(1).
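The following is a rough sketch of that idea, under loud assumptions: openCursor is a hypothetical stand-in for a database cursor, and this myMethod takes a row count instead of the question's query string so that the example stays self-contained. The rowsRead counter exists only to show that rows are pulled on demand.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class LazyRowsSketch {
    // Counts how many rows were actually pulled from the simulated source.
    static final AtomicInteger rowsRead = new AtomicInteger();

    // Hypothetical stand-in for a database cursor: rows are produced on demand.
    static Iterator<String> openCursor(int totalRows) {
        return new Iterator<String>() {
            private int next = 0;
            @Override public boolean hasNext() { return next < totalRows; }
            @Override public String next() {
                if (!hasNext()) throw new NoSuchElementException();
                rowsRead.incrementAndGet();
                return "row-" + next++;
            }
        };
    }

    // Assumed variant of the question's method; a negative limit means "no limit".
    static Stream<String> myMethod(int totalRows, int limit) {
        Spliterator<String> sp = Spliterators.spliteratorUnknownSize(
                openCursor(totalRows), Spliterator.ORDERED);
        Stream<String> rows = StreamSupport.stream(sp, false);
        return limit < 0 ? rows : rows.limit(limit);
    }

    public static void main(String[] args) {
        // The client consumes only a prefix; the rest of the source is never read.
        List<String> firstThree = myMethod(1_000_000, -1).limit(3).collect(Collectors.toList());
        System.out.println(firstThree + ", rows read: " + rowsRead.get());

        // The client aggregates into a fixed-size result without building a list.
        long matching = myMethod(1_000, -1).filter(r -> r.endsWith("7")).count();
        System.out.println(matching);
    }
}
```

A real implementation would also need to release the underlying connection, for example by registering a handler with onClose and asking callers to use try-with-resources on the returned stream.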
Note that if parallelization is also an interesting idea for your use case, you would be advised to use a custom spliterator whose splitting policy is adapted to the sequential nature of I/O data sources. In this case I can recommend a blog post of mine which presents such a spliterator.
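As an illustration of what such a splitting policy could look like, here is a minimal sketch that hands out fixed-size batches from a sequential source. It is an assumption of mine and is not taken from the blog post mentioned above; the class name FixedBatchSketch and the helper parallelStream are hypothetical.

```java
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class FixedBatchSketch<T> implements Spliterator<T> {
    private final Spliterator<T> source;
    private final int batchSize;

    FixedBatchSketch(Spliterator<T> source, int batchSize) {
        this.source = source;
        this.batchSize = batchSize;
    }

    @Override public boolean tryAdvance(Consumer<? super T> action) {
        return source.tryAdvance(action);
    }

    // Reads up to batchSize elements sequentially and splits them off as an
    // array-backed spliterator that another thread can process.
    @Override public Spliterator<T> trySplit() {
        Object[] batch = new Object[batchSize];
        int[] n = {0};
        while (n[0] < batchSize && source.tryAdvance(t -> batch[n[0]++] = t)) { }
        if (n[0] == 0) return null; // source exhausted, nothing to split off
        return Spliterators.spliterator(batch, 0, n[0], characteristics());
    }

    // The total size of an I/O source is usually unknown up front.
    @Override public long estimateSize() { return Long.MAX_VALUE; }

    // Only encounter order is passed on from the source.
    @Override public int characteristics() {
        return source.characteristics() & Spliterator.ORDERED;
    }

    // Wraps a sequential iterator in a parallel stream that splits in fixed batches.
    static <T> Stream<T> parallelStream(Iterator<T> it, int batchSize) {
        Spliterator<T> base = Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED);
        return StreamSupport.stream(new FixedBatchSketch<>(base, batchSize), true);
    }

    public static void main(String[] args) {
        Iterator<Integer> rows = IntStream.rangeClosed(1, 1000).boxed().iterator();
        System.out.println(parallelStream(rows, 64).mapToInt(Integer::intValue).sum());
    }
}
```

The batch size trades off scheduling overhead against how evenly work is spread across threads; a suitable value depends on how expensive each element is to process.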