
How do streams in Java affect memory consumption?

Tags: java, memory

I have used streams many times, but I have never read much about how they actually work. I know little about them beyond the fact that a stream is just a metaphor: a stream only represents a sequence of bytes. I guess that opening a file stream in Java interacts with the OS, which has the functionality to give back a "pointer" to a stream.

Basically, my question is how streams affect memory consumption. When you have, for instance, an input stream and you start reading from it, do you only increase memory consumption by the number of bytes read? When you open a stream in Java, you don't actually load the full file before you start reading, do you? If you read from one stream and write directly to another, do you only increase memory by the number of bytes that you read (and potentially hold in a buffer)? And if you read the bytes into a byte array in Java, do you then increase memory consumption by the size of the file?

This may sound like an odd question, but I could use some guidance or correction on my understanding. Thanks.

LuckyLuke asked Aug 04 '13 13:08



2 Answers

All of the answers above are great, but I don't believe they address your original question about memory consumption.

In Java you can look at streams in several ways. First, there are raw streams, the lowest-level streams, which interact with the underlying OS resource (file, network, etc.) with minimal memory overhead. Second, there are buffered streams, which wrap a raw stream and add buffering that can significantly improve performance. Buffering adds a fixed amount of memory overhead, and your application can set the buffer size. The default is small: in OpenJDK, BufferedInputStream uses an 8 KB (8192-byte) buffer unless you specify otherwise.
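As a rough illustration of wrapping a raw stream in a buffered one, here is a minimal sketch. The class name, method name, and the 16 KB buffer size are hypothetical choices for this example, and a ByteArrayInputStream stands in for a file or socket so that the code runs anywhere.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class BufferedReadSketch {
    // Hypothetical buffer size chosen for illustration; the one-argument
    // BufferedInputStream constructor would pick its own default instead.
    static final int BUFFER_SIZE = 16 * 1024;

    /** Counts the bytes in a stream, reading one byte at a time through a fixed-size buffer. */
    static long countBytes(InputStream raw) throws IOException {
        // The wrapper refills its internal buffer in bulk, so the single-byte
        // read() calls below do not each hit the underlying source.
        try (InputStream in = new BufferedInputStream(raw, BUFFER_SIZE)) {
            long count = 0;
            while (in.read() != -1) {
                count++;
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in source: 100,000 zero bytes held in memory.
        InputStream source = new ByteArrayInputStream(new byte[100_000]);
        System.out.println(countBytes(source)); // prints 100000
    }
}
```

The memory added by the wrapper is the buffer itself, regardless of how many bytes the source eventually delivers.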

The third type of stream is a memory stream (i.e. ByteArrayInputStream/ByteArrayOutputStream). A ByteArrayOutputStream uses as much memory as you write to it and grows as needed, while a ByteArrayInputStream reads from a byte array that is already entirely in memory. Neither releases its memory until it is no longer reachable and the garbage collector reclaims it. These streams are very useful but obviously can consume a lot of memory.
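To make the "uses as much memory as you write to it" point concrete, here is a small sketch. The class and method names are made up for this example, and the 1 KB chunk size is arbitrary.

```java
import java.io.ByteArrayOutputStream;

final class MemoryStreamSketch {
    /** Writes the given number of 1 KB chunks into memory and returns the stream's size in bytes. */
    static int fill(int chunks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        for (int i = 0; i < chunks; i++) {
            // Everything written stays in the stream's internal array.
            out.write(chunk, 0, chunk.length);
        }
        return out.size();
    }

    public static void main(String[] args) {
        System.out.println(fill(10));   // prints 10240
        System.out.println(fill(1000)); // prints 1024000
    }
}
```

Unlike a file or socket stream, nothing leaves the JVM here, so the heap use scales with the total amount written.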

The final type is not really a stream but a class of I/O called Readers (and Writers), which help with converting data to and from a stream, as was pointed out above. They operate on a raw, buffered, or memory stream and consume roughly as much memory as the underlying stream that is being used.
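A Reader layered over a byte stream might look like the sketch below. The class and method names are hypothetical, UTF-8 is an assumed encoding, and the input is an in-memory stand-in for a real source.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

final class ReaderSketch {
    /** Counts lines by decoding bytes to characters one line at a time. */
    static int countLines(InputStream in) throws IOException {
        // InputStreamReader converts bytes to chars; BufferedReader adds readLine().
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            int lines = 0;
            while (reader.readLine() != null) {
                lines++; // only the current line is held as a String
            }
            return lines;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] text = "first\nsecond\nthird\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(countLines(new ByteArrayInputStream(text))); // prints 3
    }
}
```

Processing line by line keeps only one line in memory at a time; collecting every line into a list would instead grow with the input.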

Jim M. answered Oct 19 '22 20:10


There is almost no memory overhead when you start reading from an InputStream. There is a very small OS overhead for opening a file and a tiny overhead in the JVM for allocating the new object. There may also be a small overhead if you use a BufferedInputStream, whose buffer is 8 KB by default.
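The constant-overhead case can be sketched as a copy loop that reuses one fixed buffer. The class name, method name, and 8192-byte buffer size are assumptions made for this example; in-memory streams are used only so that it runs without any files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class CopySketch {
    /** Copies everything from in to out through one reusable buffer; returns the byte count. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // the only allocation that scales with nothing
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // write only the bytes actually read
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In real use these would typically be a FileInputStream and a FileOutputStream.
        InputStream in = new ByteArrayInputStream(new byte[50_000]);
        OutputStream out = new ByteArrayOutputStream();
        System.out.println(copy(in, out)); // prints 50000
    }
}
```

With file or network streams on both ends, the loop holds at most one buffer's worth of data at any moment, however large the source is. On Java 9 and later, InputStream.transferTo does a similar job.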

The overhead for writing depends very much on where you write to. If it's a FileOutputStream, then it's the same as described above. If it's a ByteArrayOutputStream, then it's (2 * stream length) bytes in the best case and (3 * stream length) bytes in the worst case. For example, copying 10k bytes from an InputStream into a byte array allocates up to 30k bytes in the worst case.

The reason is that ByteArrayOutputStream doubles the size of its buffer whenever its capacity is reached, and it also allocates a new array when you call toByteArray().
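One way to observe this is a small subclass that exposes the length of the protected buf field. This is only a sketch: InspectableOutput and capacity() are hypothetical names, and the exact capacities printed depend on the JDK's growth policy, so no expected output is shown for them.

```java
import java.io.ByteArrayOutputStream;

final class GrowthSketch {
    /** Hypothetical subclass that exposes the backing array's length for inspection. */
    static final class InspectableOutput extends ByteArrayOutputStream {
        int capacity() {
            return buf.length; // buf is the protected backing array of ByteArrayOutputStream
        }
    }

    public static void main(String[] args) {
        InspectableOutput out = new InspectableOutput();
        byte[] chunk = new byte[1000];
        for (int i = 0; i < 10; i++) {
            out.write(chunk, 0, chunk.length);
            // Capacity is always at least the size, and usually larger.
            System.out.println("size=" + out.size() + " capacity=" + out.capacity());
        }
        // toByteArray() returns a fresh copy, so the data now exists twice.
        byte[] copy = out.toByteArray();
        System.out.println("copy length=" + copy.length);
        System.out.println("bytes held=" + (out.capacity() + copy.length));
    }
}
```

The final figure is the internal buffer plus the copy, which is where the 2x to 3x range comes from. Dropping the reference to the stream once you have the array lets the internal buffer be collected.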

Andrey Chaschev answered Oct 19 '22 20:10