I am running into an OOM when reading a large number of objects from an ObjectInputStream with readUnshared. MAT points at its internal handle table as the culprit, as does the OOM stack trace (at the end of this post). By all accounts, this shouldn't be happening. Furthermore, whether or not the OOM occurs appears to depend on how the objects were written previously.
According to this write-up on the topic, readUnshared should solve the issue (as opposed to readObject) by not creating handle table entries during read (that write-up is how I discovered writeUnshared and readUnshared, which I previously had not noticed).
However, it appears from my own observations that readObject and readUnshared behave identically, and whether the OOM happens depends on whether the objects were written with a reset() after each write (it does not matter whether writeObject or writeUnshared was used, as I previously thought; I was just tired when I first ran the tests). That is:
                   writeObject   writeObject+reset   writeUnshared   writeUnshared+reset
    readObject     OOM           OK                  OOM             OK
    readUnshared   OOM           OK                  OOM             OK
So whether or not readUnshared has any effect actually seems to be completely dependent on how the object was written. This is surprising and unexpected to me. I did spend some time tracing through the readUnshared code path, but (granted, it was late and I was tired) it wasn't apparent to me why it would still be using handle space or why that would depend on how the object was written. I now have an initial suspect, described below, although I have yet to confirm it.
From all of my research on the topic so far, it appears writeObject with readUnshared should work.
Here is the program I've been testing with:
    import java.io.BufferedInputStream;
    import java.io.BufferedOutputStream;
    import java.io.EOFException;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;

    public class OOMTest {

        // This is the object we'll be reading and writing.
        static class TestObject implements Serializable {
            private static final long serialVersionUID = 1L;
        }

        static enum WriteMode {
            NORMAL,         // writeObject
            RESET,          // writeObject + reset each time
            UNSHARED,       // writeUnshared
            UNSHARED_RESET  // writeUnshared + reset each time
        }

        // Write a bunch of objects.
        static void testWrite (WriteMode mode, String filename, int count) throws IOException {
            ObjectOutputStream out = new ObjectOutputStream(new BufferedOutputStream(new FileOutputStream(filename)));
            out.reset();
            for (int n = 0; n < count; ++ n) {
                if (mode == WriteMode.NORMAL || mode == WriteMode.RESET)
                    out.writeObject(new TestObject());
                if (mode == WriteMode.UNSHARED || mode == WriteMode.UNSHARED_RESET)
                    out.writeUnshared(new TestObject());
                if (mode == WriteMode.RESET || mode == WriteMode.UNSHARED_RESET)
                    out.reset();
                if (n % 1000 == 0)
                    System.out.println(mode.toString() + ": " + n + " of " + count);
            }
            out.close();
        }

        static enum ReadMode {
            NORMAL,   // readObject
            UNSHARED  // readUnshared
        }

        // Read all the objects.
        @SuppressWarnings("unused")
        static void testRead (ReadMode mode, String filename) throws Exception {
            ObjectInputStream in = new ObjectInputStream(new BufferedInputStream(new FileInputStream(filename)));
            int count = 0;
            while (true) {
                try {
                    TestObject o;
                    if (mode == ReadMode.NORMAL)
                        o = (TestObject)in.readObject();
                    if (mode == ReadMode.UNSHARED)
                        o = (TestObject)in.readUnshared();
                    //
                    if ((++ count) % 1000 == 0)
                        System.out.println(mode + " (read): " + count);
                } catch (EOFException eof) {
                    break;
                }
            }
            in.close();
        }

        // Do the test. Comment/uncomment as appropriate.
        public static void main (String[] args) throws Exception {
            /* Note: For writes to succeed, VM heap size must be increased.
            testWrite(WriteMode.NORMAL, "test-writeObject.dat", 30_000_000);
            testWrite(WriteMode.RESET, "test-writeObject-with-reset.dat", 30_000_000);
            testWrite(WriteMode.UNSHARED, "test-writeUnshared.dat", 30_000_000);
            testWrite(WriteMode.UNSHARED_RESET, "test-writeUnshared-with-reset.dat", 30_000_000);
            */
            /* Note: For read demonstration of OOM, use default heap size. */
            testRead(ReadMode.UNSHARED, "test-writeObject.dat"); // Edit this line for different tests.
        }
    }
Steps to recreate the issue with that program:
1. Run the program with the testWrite calls uncommented (and testRead not called) and the heap size set high, so that writeObject does not lead to an OOM.
2. Run the program again with testRead uncommented (and testWrite not called) and the default heap size.

To be clear: I'm not doing the writing and reading in the same JVM instance. My writes happen in a separate program from my reads. The test program above may be slightly misleading at first glance because I crammed both the write and read tests into the same source.
Unfortunately, in my real situation I have a file containing a lot of objects written with writeObject (without reset). Regenerating it would take quite some time (on the order of days), and the reset also makes the output files massive, so I'd like to avoid that if possible. On the other hand, I cannot currently read the file with readObject, even with the heap space cranked up to the maximum available on my system.
It's worth noting that in my real situation, I do not need the caching provided by the object stream handle tables.
So my questions are:
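1. I would not expect any connection between readUnshared's behavior and how the objects were written. What is going on here?
2. Is there a way to read the existing file without hitting the OOM, given that it was written with writeObject and no reset?

I'm not entirely sure why readUnshared is failing to resolve the issue here.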
I hope this is clear. I am running on empty here so may have typed strange words.
From comments on an answer below:
"If you're not calling writeObject() in the current instance of the JVM you should not be consuming memory by calling readUnshared()."
All my research shows the same, and yet, confusingly:
Here is the OOM stack trace, pointing at readUnshared:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.io.ObjectInputStream$HandleTable.grow(ObjectInputStream.java:3464)
    at java.io.ObjectInputStream$HandleTable.assign(ObjectInputStream.java:3271)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1789)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readUnshared(ObjectInputStream.java:460)
    at OOMTest.testRead(OOMTest.java:40)
    at OOMTest.main(OOMTest.java:54)
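The trace shows HandleTable.assign being reached from readUnshared. One possible reading, offered as an assumption rather than something confirmed here, is that handles in a serialization stream are positional: every object written takes the next handle number, so a reader has to reserve a slot even for an unshared object or later back-references would point at the wrong entry. The sketch below illustrates only that positional behavior at small scale; HandleSlotDemo and Item are hypothetical names, not part of the test program above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class; not part of the original test program.
class HandleSlotDemo {

    static class Item implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Writes [unshared item, shared item, same shared item again] and reads
    // them back as [readUnshared, readObject, readObject]. Returns true if
    // the third read resolves to the same instance as the second, i.e. the
    // back-reference still lines up after an unshared read came before it.
    static boolean backReferenceSurvivesUnsharedRead() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                Item shared = new Item();
                out.writeUnshared(new Item()); // occupies a handle in the stream
                out.writeObject(shared);       // occupies the next handle
                out.writeObject(shared);       // written as a back-reference
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                Object first = in.readUnshared();
                Object second = in.readObject();
                Object third = in.readObject();
                return first != second && second == third;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(backReferenceSurvivesUnsharedRead());
    }
}
```

If that reading is right, the reader's table would grow with the number of objects since the last reset in the stream regardless of which read method is used, which would be consistent with the results table in the question.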
Here is a video of it happening (the video was recorded before the recent test program edit; it is the equivalent of ReadMode.UNSHARED and WriteMode.NORMAL in the new test program).
Here are some test data files, which contain 30,000,000 objects (compressed size is a tiny 360 KB, but be warned that it expands to a whopping 2.34 GB). There are four test files here, each generated with a different combination of writeObject/writeUnshared and reset. The read behavior depends only on how the file was written and is independent of readObject vs. readUnshared. Note that the writeObject and writeUnshared data files are byte-for-byte identical; I can't decide if this is surprising or not.
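The byte-for-byte observation can be checked at small scale without touching the disk. This is a minimal sketch, assuming in-memory streams and a hypothetical StreamBytesDemo class with an empty Item class standing in for TestObject; it compares the raw bytes produced by writeObject and writeUnshared for distinct instances, and shows that adding reset() after each write produces a larger stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class; not part of the original test program.
class StreamBytesDemo {

    static class Item implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Serializes `count` distinct Item instances and returns the stream bytes.
    static byte[] write(int count, boolean unshared, boolean resetEachTime) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                for (int i = 0; i < count; i++) {
                    if (unshared) {
                        out.writeUnshared(new Item());
                    } else {
                        out.writeObject(new Item());
                    }
                    if (resetEachTime) {
                        out.reset();
                    }
                }
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] shared = write(100, false, false);
        byte[] unshared = write(100, true, false);
        byte[] sharedWithReset = write(100, false, true);
        System.out.println("identical bytes: " + Arrays.equals(shared, unshared));
        System.out.println("no reset: " + shared.length
                + " bytes, reset each time: " + sharedWithReset.length + " bytes");
    }
}
```

Since every instance here is written exactly once, the shared/unshared distinction only changes what the writer remembers, not what it emits, which would explain why the reader cannot tell the two files apart.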
I've been staring at the ObjectInputStream code from here. My current suspect is this line, present in 1.7 and 1.8:
    ObjectStreamClass desc = readClassDesc(false);
That boolean parameter is true for unshared and false for normal. In all other cases the "unshared" flag is propagated through to other calls, but in this case it's hard-coded to false, thus causing handles to be added to the handle table when reading class descriptions for serialized objects even when readUnshared is used. AFAICT, this is the only occurrence of the unshared flag not being passed through to other methods, which is why I am focused on it.
This is in contrast to e.g. this line, where the unshared flag is passed through to readClassDesc. (You can trace the call path from readUnshared to both of those lines if anybody wishes to dig in.)
However, I have not yet confirmed that any of this is significant, or worked out why false is hard-coded there. This is just the track I'm currently taking in looking into this; it may prove meaningless.
Also, fwiw, ObjectInputStream does have a private method, clear, that clears the handle table. I did an experiment where I called that (via reflection) after every read, but it just broke everything, so that's a no-go.
"However, it seems that if the objects were written using writeObject() rather than writeUnshared(), then readUnshared() does not decrease handle table usage."
That is correct. readUnshared() only decreases handle table usage attributable to readObject(). If you are in the same JVM that is using writeObject() rather than writeUnshared(), handle table usage attributable to writeObject() is not decreased by readUnshared().
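For context, what readUnshared is documented to guarantee concerns object identity rather than memory: as I understand the Javadoc, once an object has been read with readUnshared, any later back-reference to it in the stream is rejected with an ObjectStreamException. The following is a small sketch of that behavior only, using a hypothetical UnsharedReadDemo class; it does not measure handle table usage.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

// Hypothetical demo class; not part of the original test program.
class UnsharedReadDemo {

    static class Item implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Writes the same instance twice with writeObject, reads the first copy
    // with readUnshared, then tries to read the back-reference with
    // readObject. Returns true if that second read is rejected.
    static boolean backReferenceToUnsharedIsRejected() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                Item item = new Item();
                out.writeObject(item); // full object
                out.writeObject(item); // back-reference to the first write
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readUnshared();
                try {
                    in.readObject();
                    return false; // the back-reference was accepted
                } catch (ObjectStreamException expected) {
                    return true;  // the back-reference was rejected
                }
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(backReferenceToUnsharedIsRejected());
    }
}
```

To reject that back-reference, the reader has to remember that the earlier handle was read as unshared, so some per-object bookkeeping on the read side seems unavoidable even with readUnshared.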
writeUnshared() still writes a null into its handle table, which grows as you write more objects. That is the reason you got an OOM on readUnshared.
Check this: OutOfMemoryException : Memory leak in the java class ObjectOutputStream and ObjectInputStream
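If regenerating the data is ever an option, one middle ground between never calling reset() and calling it after every write is to reset periodically. This is only a sketch under assumptions: PeriodicResetDemo is a hypothetical class, the interval of 1,000 is arbitrary, and it does nothing for a file that has already been written without resets.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of the original test program.
class PeriodicResetDemo {

    static class Item implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Writes `count` objects, calling reset() after every `resetInterval`
    // objects so neither side accumulates handles indefinitely.
    static byte[] write(int count, int resetInterval) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                for (int i = 0; i < count; i++) {
                    out.writeUnshared(new Item());
                    if ((i + 1) % resetInterval == 0) {
                        out.reset();
                    }
                }
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads objects until end of stream and returns how many were read.
    static int countObjects(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(data))) {
            int n = 0;
            while (true) {
                try {
                    in.readUnshared();
                    n++;
                } catch (EOFException eof) {
                    return n;
                }
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] periodic = write(10_000, 1_000);
        byte[] everyWrite = write(10_000, 1);
        System.out.println("objects read back: " + countObjects(periodic));
        System.out.println("periodic reset: " + periodic.length
                + " bytes, reset every write: " + everyWrite.length + " bytes");
    }
}
```

Each reset makes the writer emit the class description again, so a longer interval keeps the file closer to the no-reset size while still capping how many handles accumulate between resets.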