One may suggest that BufferedImage is the best option for processing an image in Java. While it's convenient, reading huge images with it often ends in:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
Increasing the VM heap size isn't a solution, since some of my input files are really huge.
So I'm looking for a way to read an image progressively, from a stream.
I suspect that ImageIO.createImageInputStream() from the ImageIO API might fit the bill, but I'm not sure how to use it to read the chunks progressively.
Also, there are the classes PNGMetadata and PNGImageReader in the JDK's rt.jar which seem useful, but I couldn't find simple examples of their usage.
Is this the way to go, or are there better alternatives?
The Java APIs for reading and manipulating images aren't really stream-based in the way you seem to think. The ImageInputStream is just a convenience wrapper that allows reading bytes and other primitive types from different inputs (RandomAccessFiles, InputStreams, etc.).
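To make that distinction concrete, here is a minimal sketch (class and method names are my own) showing that an ImageInputStream hands you bytes and primitives, not pixels:

```java
import javax.imageio.ImageIO;
import javax.imageio.stream.ImageInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

public class StreamDemo {
    // Reads the first four bytes of a PNG signature through an ImageInputStream
    static int[] readSignature() throws IOException {
        byte[] data = {(byte) 0x89, 'P', 'N', 'G'}; // start of the PNG magic bytes
        try (ImageInputStream in = ImageIO.createImageInputStream(new ByteArrayInputStream(data))) {
            return new int[] {
                in.readUnsignedByte(), in.readUnsignedByte(),
                in.readUnsignedByte(), in.readUnsignedByte()
            };
        }
    }

    public static void main(String[] args) throws IOException {
        int[] sig = readSignature();
        // The stream yields raw bytes; decoding them into pixels is the reader's job
        System.out.println(sig[0] + " " + (char) sig[1] + (char) sig[2] + (char) sig[3]); // 137 PNG
    }
}
```

In other words, the "stream" in ImageInputStream is about byte-level I/O over arbitrary sources, not about progressive pixel delivery.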
I've been thinking about creating an API for reading "pixel streams", to allow chaining of processing filters without using much memory. But the need has never been serious enough to be worth the effort. Feel free to hire me if you'd like to hear more ideas or want a working implementation. ;-)
Still, as I see it, you have multiple options to achieve your ultimate goal of processing large images:
Use BufferedImage as-is, and the ImageIO API to read the image in smaller parts to conserve memory. This will be quite efficient for some formats and less so for others, due to the reader implementations (e.g. the default JPEGImageReader will read the entire image into native memory before handing a smaller region over to the Java heap, but the PNGImageReader might be okay).
Something along the lines of:
import javax.imageio.ImageIO;
import javax.imageio.ImageReadParam;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

ImageInputStream stream = ImageIO.createImageInputStream(input);
ImageReader reader = ImageIO.getImageReaders(stream).next(); // TODO: Test hasNext()
reader.setInput(stream);

int width = reader.getWidth(0);
int height = reader.getHeight(0);

ImageReadParam param = reader.getDefaultReadParam();

for (int y = 0; y < height; y += 100) {
    for (int x = 0; x < width; x += 100) {
        // Clamp the tile to the image bounds
        int w = Math.min(100, width - x);
        int h = Math.min(100, height - y);
        param.setSourceRegion(new Rectangle(x, y, w, h));

        // Read a tile (at most 100 x 100) from the image
        BufferedImage region = reader.read(0, param);

        // ...process region as needed...
    }
}
Read the entire image at once, into a memory-mapped buffer. Feel free to try some experimental classes I've made for this purpose (using nio). Reading will be slower than reading into a pure in-memory image, and processing will also be slower. But if you are doing computations on smaller regions of the image at a time, it can be about as fast as in-memory with some optimizations. I've read > 1 GB images into a 32 MB JVM using these classes (real memory consumption is of course far greater).
Again, here's an example:
import javax.imageio.ImageIO;
import javax.imageio.ImageReadParam;
import javax.imageio.ImageReader;
import javax.imageio.ImageTypeSpecifier;
import javax.imageio.stream.ImageInputStream;
import java.awt.image.BufferedImage;

ImageInputStream stream = ImageIO.createImageInputStream(input);
ImageReader reader = ImageIO.getImageReaders(stream).next(); // TODO: Test hasNext()
reader.setInput(stream);

int width = reader.getWidth(0);
int height = reader.getHeight(0);

ImageTypeSpecifier spec = reader.getImageTypes(0).next(); // TODO: Test hasNext()
BufferedImage image = MappedImageFactory.createCompatibleMappedImage(width, height, spec);

ImageReadParam param = reader.getDefaultReadParam();
param.setDestination(image);

image = reader.read(0, param); // Will return the same image as created above

// ...process image as needed...
Some formats, like uncompressed TIFF, BMP, PPM etc., keep the pixels in the file in a way that makes it possible to memory-map them directly and manipulate them in place. This requires some work, but should be possible. TIFF also supports tiles, which might help. I'll leave this option as an exercise; feel free to use the classes I linked above as a starting point or inspiration. ;-)
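As a rough illustration of the idea (the class and method names here are my own, not from the classes linked above): a binary PPM (P6) stores raw RGB bytes after a short ASCII header, so the pixel data can be mapped with nio and modified in place without loading the image:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedPpmDemo {
    // Writes a tiny 2x2 binary PPM, then inverts its first pixel in place via a memory map.
    // Returns the first pixel's RGB values after inversion.
    static int[] invertFirstPixel() throws IOException {
        byte[] header = "P6\n2 2\n255\n".getBytes(StandardCharsets.US_ASCII);
        byte[] pixels = {
            (byte) 255, 0, 0,  0, (byte) 255, 0,                       // red, green
            0, 0, (byte) 255,  (byte) 255, (byte) 255, (byte) 255     // blue, white
        };
        Path file = Files.createTempFile("demo", ".ppm");
        byte[] all = new byte[header.length + pixels.length];
        System.arraycopy(header, 0, all, 0, header.length);
        System.arraycopy(pixels, 0, all, header.length, pixels.length);
        Files.write(file, all);

        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Map only the pixel data, skipping the ASCII header
            MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_WRITE, header.length, pixels.length);
            for (int i = 0; i < 3; i++) {
                map.put(i, (byte) (255 - (map.get(i) & 0xFF))); // invert R, G, B of pixel (0, 0)
            }
            map.force(); // flush the change back to the file
        }

        byte[] out = Files.readAllBytes(file);
        int[] rgb = new int[3];
        for (int i = 0; i < 3; i++) {
            rgb[i] = out[header.length + i] & 0xFF;
        }
        Files.delete(file);
        return rgb;
    }

    public static void main(String[] args) throws IOException {
        int[] rgb = invertFirstPixel();
        System.out.println(rgb[0] + "," + rgb[1] + "," + rgb[2]); // red (255,0,0) inverted -> 0,255,255
    }
}
```

A real implementation would of course parse the header (which may contain comments and variable whitespace) instead of hard-coding its length.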
JAI may have something that can help you. I'm not a big fan, due to its many unresolved bugs and the lack of love and development from Oracle. But it's worth checking out. I think it has support for tiled and disk-based RenderedImages as well. Again, I'll leave this option for you to explore further.
The memory problem is surely related not to the decoding process itself, but to storing the full image in memory as a BufferedImage. It is possible to read a PNG image progressively, but:
This is only marginally related to the organization in "chunks"; it has more to do with the fact that PNG files are encoded line by line, and so they can (in principle) be read line by line.
The above assumption breaks for interlaced PNGs, but one should not expect huge PNG images to be stored in interlaced format.
While some PNG libraries allow for progressive (line-by-line) decoding (e.g. libpng), the standard Java API does not give you that.
I was faced with that problem and ended up coding my own Java library: PNGJ. It's quite mature; it allows reading PNG images line by line, minimizing memory consumption, and writing them back in the same way. (Even interlaced PNGs can be read, but in that case the memory problem will not go away.) If you only need to do some "local" image processing (modify each pixel value depending on the current value and its neighbours, and write it back), this should help.
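A sketch of that line-by-line pattern, based on my reading of the PNGJ 2.x API (the file names and the darkening operation are just placeholders, not part of the library):

```java
import ar.com.hjg.pngj.ImageLineInt;
import ar.com.hjg.pngj.PngReader;
import ar.com.hjg.pngj.PngWriter;
import java.io.File;

public class DarkenPng {
    public static void main(String[] args) {
        PngReader reader = new PngReader(new File("in.png"));   // placeholder input file
        PngWriter writer = new PngWriter(new File("out.png"), reader.imgInfo);
        for (int row = 0; row < reader.imgInfo.rows; row++) {
            // Only one row is held in memory at a time
            ImageLineInt line = (ImageLineInt) reader.readRow(row);
            int[] scanline = line.getScanline(); // samples, interleaved per channel
            for (int i = 0; i < scanline.length; i++) {
                scanline[i] = scanline[i] / 2;   // example "local" operation: darken
            }
            writer.writeRow(line, row);
        }
        reader.end();
        writer.end();
    }
}
```

The key point is that the loop body never sees more than one scanline, so peak memory stays roughly constant regardless of image height.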