I see some posts on StackOverflow that contradict each other, and I would like to get a definite answer.
I started with the assumption that using a Java InputStream would allow me to stream bytes out of a file, and thus save on memory, as I would not have to consume the whole file at once. And that is exactly what I read here:
Loading all bytes to memory is not a good practice. Consider returning the file and opening an input stream to read it, so your application won't crash when handling large files. – andrucz
Download file to stream instead of File
But then I used an InputStream to read a very large Microsoft Excel file (using the Apache POI library) and I ran into this error:
java.lang.outofmemory exception while reading excel file (xlsx) using POI
I got an OutOfMemory error.
And this crucial bit of advice saved me:
One thing that'll make a small difference is when opening the file to start with. If you have a file, then pass that in! Using an InputStream requires buffering of everything into memory, which eats up space. Since you don't need to do that buffering, don't!
I got rid of the InputStream and just used a bare java.io.File, and then the OutOfMemory error went away.
So using java.io.File is better than an InputSteam, when it comes to memory use? That doesn't make any sense.
What is the real answer?
Note that InputStream is an abstract class and exact details depend on the specific subclass. For example ByteArrayInputStream holds everything in memory, FileInputStream holds nothing in memory, while BufferedInputStream only holds a buffer of N number of bytes in memory.
The InputStream is used to read data from a source and the OutputStream is used for writing data to a destination. Here is a hierarchy of classes to deal with Input and Output streams.
resource-leak is probably more of a concern here. Handling inputstream requires OS to use its resources and if you don't free it up once you use it, you will eventually run out of resources.
1.1 InputStream: InputStream is an abstract class of Byte Stream that describe stream input and it is used for reading and it could be a file, image, audio, video, webpage, etc. it doesn't matter. Thus, InputStream read data from source one item at a time.
So you are saying that an
InputStream
would typically help?
It entirely depends on how the application (or library) >>uses<< the InputStream
With what kind of follow up code? Could you offer an example of memory efficient Java?
For example:
// Efficient use of memory
try (InputStream is = new FileInputStream(largeFileName);
BufferedReader br = new BufferedReader(new InputStreamReader(is))) {
String line;
while ((line = br.readLine()) != null) {
// process one line
}
}
// Inefficient use of memory
try (InputStream is = new FileInputStream(largeFileName);
BufferedReader br = new BufferedReader(new InputStreamReader(is))) {
StringBuilder sb = new StringBuilder();
while ((line = br.readLine()) != null) {
sb.append(line).append("\n");
}
String everything = sb.toString();
// process the entire string
}
// Very inefficient use of memory
try (InputStream is = new FileInputStream(largeFileName);
BufferedReader br = new BufferedReader(new InputStreamReader(is))) {
String everything = "";
while ((line = br.readLine()) != null) {
everything += line + "\n";
}
// process the entire string
}
(Note that there are more efficient ways of reading a file into memory. The above examples are purely to illustrate the principles.)
The general principles here are:
The posts that you linked to above:
The first one is not really about memory efficiency. Rather it is talking about a limitation of the AWS client-side library. Apparently, the API doesn't provide an easy way to stream an object while reading it. You have to save it the object to a file, then open the file as a stream. Whether that is memory efficient or not depends on what the application does with the stream; see above.
The second one specific to the POI APIs. Apparently, the POI library itself is reading the stream contents into memory if you use a stream. That would be an implementation limitation of that particular library.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With