Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java 8 Stream vs Collection Storage

I have been reading up on Java 8 Streams and the way data is streamed from a data source, rather than have the entire collection to extract data from.

This quote in particular I read on an article regarding streams in Java 8.

No storage. Streams don't have storage for values; they carry values from a source (which could be a data structure, a generating function, an I/O channel, etc) through a pipeline of computational steps.

I understand the concept of streaming data in from a source piece by piece. What I don't understand is if you are streaming from a collection how is there no storage? The collection already exists on the Heap, you are just streaming the data from that collection, the collection already exists in "storage".

What's the difference memory-footprint wise if I were to just loop through the collection with a standard for loop?

like image 538
user3587411 Avatar asked Apr 22 '15 04:04

user3587411


People also ask

Does streams in Java 8 have limited storage?

No storage. Streams don't have storage for values; they carry values from a source (which could be a data structure, a generating function, an I/O channel, etc) through a pipeline of computational steps.

What is the difference between collections and stream in Java 8?

Differences between a Stream and a Collection: A stream does not store data. An operation on a stream does not modify its source, but simply produces a result. Collections have a finite size, but streams do not.

Which is faster stream or collection?

For this particular test, streams are about twice as slow as collections, and parallelism doesn't help (or either I'm using it the wrong way?).

How is stream different from collections?

A collection is an in-memory data structure, which holds all the values that the data structure currently has—every element in the collection has to be computed before it can be added to the collection. In contrast, a stream is a conceptually fixed data structure in which elements are computed on demand.


1 Answers

The statement about streams and storage means that a stream doesn't have any storage of its own. If the stream's source is a collection, then obviously that collection has storage to hold the elements.

Let's take one of examples from that article:

int sum = shapes.stream()
                .filter(s -> s.getColor() == BLUE)
                .mapToInt(s -> s.getWeight())
                .sum();

Assume that shapes is a Collection that has millions of elements. One might imagine that the filter operation would iterate over the elements from the source and create a temporary collection of results, which might also have millions of elements. The mapToInt operation might then iterate over that temporary collection and generate its results to be summed.

That's not how it works. There is no temporary, intermediate collection. The stream operations are pipelined, so elements emerging from filter are passed through mapToInt and thence to sum without being stored into and read from a collection.

If the stream source weren't a collection -- say, elements were being read from a network collection -- there needn't be any storage at all. A pipeline like the following:

int sum = streamShapesFromNetwork()
                .filter(s -> s.getColor() == BLUE)
                .mapToInt(s -> s.getWeight())
                .sum();

might process millions of elements, but it wouldn't need to store millions of elements anywhere.

like image 167
Stuart Marks Avatar answered Oct 05 '22 03:10

Stuart Marks